P29122 (PCSK6_HUMAN) Reviewed, UniProtKB/Swiss-Prot

Last modified April 16, 2014. Version 147.

Names and origin

Protein names
Recommended name:
Proprotein convertase subtilisin/kexin type 6

EC=3.4.21.-
Alternative name(s):
Paired basic amino acid cleaving enzyme 4
Subtilisin-like proprotein convertase 4
Short name=SPC4
Subtilisin/kexin-like protease PACE4
Gene names
Name: PCSK6
Synonyms: PACE4
Organism: Homo sapiens (Human) [Reference proteome]
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo

Protein attributes

Sequence length: 969 AA.
Sequence status: Complete.
Sequence processing: The displayed sequence is further processed into a mature form.
Protein existence: Evidence at protein level

General annotation (Comments)

Function

Likely to represent an endoprotease activity within the constitutive secretory pathway, with unique restricted distribution in both neuroendocrine and non-neuroendocrine tissues and capable of cleavage at the RX(K/R)R consensus motif.

Catalytic activity

Release of mature proteins from their proproteins by cleavage of Arg-Xaa-Yaa-Arg-|-Zaa bonds, where Xaa can be any amino acid and Yaa is Arg or Lys.
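
The consensus described above, Arg-Xaa-(Lys/Arg)-Arg with cleavage after the final Arg, can be scanned for with a short regular expression. The following is a minimal Python sketch and not part of the entry: the function name `find_cleavage_motifs` and its `offset` parameter are illustrative assumptions, and a match only marks a candidate site, not evidence of cleavage. The example fragment is residues 141-150 of the canonical sequence listed in this entry.

```python
import re

# R-X-(K/R)-R consensus. The lookahead lets overlapping candidates be reported.
MOTIF = re.compile(r"(?=(R.[KR]R))")

def find_cleavage_motifs(seq, offset=1):
    """Return (start, end, motif) tuples in 1-based coordinates.

    `offset` is the sequence position of the first residue in `seq`.
    Cleavage, if it occurs, would follow the residue at `end`.
    """
    hits = []
    for match in MOTIF.finditer(seq):
        start = match.start() + offset
        hits.append((start, start + 3, match.group(1)))
    return hits

# Residues 141-150 of the canonical sequence (hypothetical usage example).
fragment = "QEVKRRVKRQ"
print(find_cleavage_motifs(fragment, offset=141))  # [(146, 149, 'RVKR')]
```

The single hit ends at residue 149, which agrees with the autolytic cleavage site annotated at positions 149-150 in the feature table of this entry.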

Cofactor

Calcium (Potential).

Subunit structure

The PACE4A-I precursor protein seems to exist in the endoplasmic reticulum as both a monomer and a dimer-sized complex, whereas mature PACE4A-I exists only as a monomer, suggesting that propeptide cleavage affects its tertiary or quaternary structure.

Subcellular location

Isoform PACE4A-I: Secreted.

Isoform PACE4A-II: Secreted.

Isoform PACE4C: Endoplasmic reticulum. Note: Not secreted; probably remains in zymogen form in the endoplasmic reticulum.

Isoform PACE4CS: Endoplasmic reticulum. Note: Not secreted; probably remains in zymogen form in the endoplasmic reticulum.

Isoform PACE4E-I: Endomembrane system; Peripheral membrane protein. Note: Probably retained intracellularly through a hydrophobic cluster at its C-terminus.

Isoform PACE4E-II: Endomembrane system; Peripheral membrane protein. Note: Probably retained intracellularly through a hydrophobic cluster at its C-terminus.

Isoform PACE4B: Secreted.

Tissue specificity

Each PACE4 isoform exhibits a unique restricted distribution. Isoform PACE4A-I is expressed in heart, brain, placenta, lung, skeletal muscle, kidney and pancreas, with comparatively higher levels in the liver. Isoform PACE4A-II is expressed at least in placenta. Isoform PACE4B was found only in the embryonic kidney cell line from which it was isolated. Isoform PACE4C and isoform PACE4D are expressed in placenta. Isoform PACE4E-I is expressed in cerebellum, placenta and pituitary. Isoform PACE4E-II is present at least in cerebellum.

Domain

The propeptide domain acts as an intramolecular chaperone assisting the folding of the zymogen within the endoplasmic reticulum. Isoform PACE4D lacks the propeptide domain.

Sequence similarities

Belongs to the peptidase S8 family.

Contains 1 homo B/P domain.

Contains 1 PLAC domain.

Ontologies

Keywords
   Cellular component: Endoplasmic reticulum; Membrane; Secreted
   Coding sequence diversity: Alternative splicing; Polymorphism
   Domain: Repeat; Signal
   Ligand: Calcium
   Molecular function: Hydrolase; Protease; Serine protease
   PTM: Cleavage on pair of basic residues; Glycoprotein; Zymogen
   Technical term: Complete proteome; Reference proteome
Gene Ontology (GO)
   Biological_process

determination of left/right symmetry

Inferred from electronic annotation. Source: Ensembl

glycoprotein metabolic process

Inferred from direct assay PubMed 8218226. Source: BHF-UCL

nerve growth factor processing

Traceable author statement. Source: Reactome

nerve growth factor production

Inferred from direct assay PubMed 8615794. Source: BHF-UCL

neurotrophin TRK receptor signaling pathway

Traceable author statement. Source: Reactome

peptide hormone processing

Inferred from direct assay PubMed 9242664. Source: BHF-UCL

protein processing

Inferred from direct assay PubMed 8218226, PubMed 9242664, Ref.9. Source: BHF-UCL

proteolysis

Inferred from electronic annotation. Source: UniProtKB-KW

regulation of BMP signaling pathway

Traceable author statement PubMed 10467177. Source: BHF-UCL

secretion by cell

Inferred from direct assay PubMed 8615794. Source: BHF-UCL

zygotic determination of anterior/posterior axis, embryo

Inferred from electronic annotation. Source: Ensembl

   Cellular_component

Golgi lumen

Traceable author statement. Source: Reactome

cell surface

Inferred from direct assay PubMed 12535616. Source: BHF-UCL

endoplasmic reticulum

Inferred from electronic annotation. Source: UniProtKB-SubCell

extracellular matrix

Inferred from direct assay PubMed 12535616. Source: BHF-UCL

extracellular space

Inferred from direct assay Ref.7. Source: BHF-UCL

membrane

Inferred from electronic annotation. Source: UniProtKB-KW

   Molecular_function

endopeptidase activity

Inferred from direct assay PubMed 9242664. Source: BHF-UCL

heparin binding

Inferred from direct assay PubMed 12535616. Source: BHF-UCL

nerve growth factor binding

Inferred from direct assay PubMed 8615794. Source: BHF-UCL

serine-type endopeptidase activity

Inferred from direct assay PubMed 8218226, PubMed 8615794, Ref.7. Source: BHF-UCL

Alternative products

This entry describes 8 isoforms produced by alternative splicing.
Isoform PACE4A-I (identifier: P29122-1)

Also known as: PACE4;

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
Isoform PACE4A-II (identifier: P29122-2)

The sequence of this isoform differs from the canonical sequence as follows:
     680-693: AQSTPGSANILQTS → G
Isoform PACE4B (identifier: P29122-3)

Also known as: PACE4.1;

The sequence of this isoform differs from the canonical sequence as follows:
     471-471: K → KGAAVAFWWTIGWPWNV
     472-969: Missing.
Note: Probably enzymatically inactive.
Isoform PACE4C (identifier: P29122-4)

The sequence of this isoform differs from the canonical sequence as follows:
     621-652: KLKEWSLILYGTAEHPYHTFSAHQSRSRMLEL → DLETPVANQLTTEEREPGLKHVFRWQIEQELW
     653-969: Missing.
Note: Probably enzymatically inactive.
Isoform PACE4CS (identifier: P29122-5)

The sequence of this isoform differs from the canonical sequence as follows:
     621-623: KLK → NLD
     624-969: Missing.
Note: Probably enzymatically inactive.
Isoform PACE4D (identifier: P29122-6)

The sequence of this isoform differs from the canonical sequence as follows:
     1-167: Missing.
     621-664: KLKEWSLILY...PELEPPKAAL → DLETPVANQL...YHIVLITVAL
     665-969: Missing.
Note: Probably enzymatically inactive.
Isoform PACE4E-I (identifier: P29122-7)

The sequence of this isoform differs from the canonical sequence as follows:
     901-969: CDENCLSCAG...FCCRTCLLAG → YGPPGGERQA...AVGRHRAAAG
Isoform PACE4E-II (identifier: P29122-8)

The sequence of this isoform differs from the canonical sequence as follows:
     680-693: AQSTPGSANILQTS → G
     901-969: CDENCLSCAG...FCCRTCLLAG → YGPPGGERQA...AVGRHRAAAG
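
The isoform descriptions above are edits against the canonical 969-residue sequence, so isoform lengths follow by arithmetic. The sketch below is a hedged illustration in Python, not part of the entry: the names `ISOFORM_EDITS`, `isoform_length` and `apply_edits` are assumptions introduced here, and only the three isoforms whose replacement sequences are given in full above are encoded.

```python
CANONICAL_LENGTH = 969

# (start, end, replacement) in 1-based inclusive canonical coordinates.
# An empty replacement stands for "Missing".
ISOFORM_EDITS = {
    "PACE4A-II": [(680, 693, "G")],
    "PACE4B": [(471, 471, "KGAAVAFWWTIGWPWNV"), (472, 969, "")],
    "PACE4CS": [(621, 623, "NLD"), (624, 969, "")],
}

def isoform_length(edits, canonical_length=CANONICAL_LENGTH):
    """Length after replacing each [start, end] span with its replacement."""
    length = canonical_length
    for start, end, replacement in edits:
        length += len(replacement) - (end - start + 1)
    return length

def apply_edits(seq, edits):
    """Apply the same edits to an actual sequence string.

    Edits are applied right to left so earlier coordinates stay valid.
    """
    for start, end, replacement in sorted(edits, reverse=True):
        seq = seq[:start - 1] + replacement + seq[end:]
    return seq

for name, edits in ISOFORM_EDITS.items():
    print(name, isoform_length(edits))
# PACE4A-II 956
# PACE4B 487
# PACE4CS 623
```

These values agree with the isoform lengths listed in the Sequences section of this entry.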

Sequence annotation (Features)

Feature key           Position(s)  Length  Description  Feature identifier

Molecule processing

Signal peptide        1 – 63       63      Potential
Propeptide            64 – 149     86                                                     PRO_0000027110
Chain                 150 – 969    820     Proprotein convertase subtilisin/kexin type 6  PRO_0000027111
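
Feature positions in this table are 1-based and inclusive, so a length is `end - start + 1`. As a small hedged sketch (the names `SEGMENTS`, `segment_length` and `is_contiguous` are illustrative assumptions, not part of the entry), the three processing segments can be checked to tile the 969-residue precursor without gaps:

```python
# (feature key, start, end) copied from the molecule processing rows.
SEGMENTS = [
    ("Signal peptide", 1, 63),
    ("Propeptide", 64, 149),
    ("Chain", 150, 969),
]

def segment_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1

def is_contiguous(segments):
    """True if each segment starts right after the previous one ends."""
    return all(prev[2] + 1 == nxt[1] for prev, nxt in zip(segments, segments[1:]))

lengths = {name: segment_length(start, end) for name, start, end in SEGMENTS}
print(lengths)                # {'Signal peptide': 63, 'Propeptide': 86, 'Chain': 820}
print(sum(lengths.values()))  # 969
print(is_contiguous(SEGMENTS))  # True
```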

Regions

Domain                496 – 634    139     Homo B/P
Domain                931 – 969    39      PLAC
Region                150 – 454    305     Catalytic
Region                695 – 930    236     CRM (Cys-rich motif)
Motif                 553 – 555    3       Cell attachment site (Potential)

Sites

Active site           205          1       Charge relay system (By similarity)
Active site           246          1       Charge relay system (By similarity)
Active site           420          1       Charge relay system (By similarity)
Site                  149 – 150    2       Cleavage; by autolysis

Amino acid modifications

Glycosylation         259          1       N-linked (GlcNAc...) (Potential)
Glycosylation         914          1       N-linked (GlcNAc...) (Potential)
Glycosylation         932          1       N-linked (GlcNAc...) (Potential)

Natural variations

Alternative sequence  1 – 167      167     Missing in isoform PACE4D.  VSP_005427
Alternative sequence  471          1       K → KGAAVAFWWTIGWPWNV in isoform PACE4B.  VSP_005428
Alternative sequence  472 – 969    498     Missing in isoform PACE4B.  VSP_005429
Alternative sequence  621 – 664    44      KLKEW…PKAAL → DLETPVANQLTTEERFVSTPSILFHWSVYLSWSQYHIVLITVAL in isoform PACE4D.  VSP_005434
Alternative sequence  621 – 652    32      KLKEW…RMLEL → DLETPVANQLTTEEREPGLKHVFRWQIEQELW in isoform PACE4C.  VSP_005432
Alternative sequence  621 – 623    3       KLK → NLD in isoform PACE4CS.  VSP_005430
Alternative sequence  624 – 969    346     Missing in isoform PACE4CS.  VSP_005431
Alternative sequence  653 – 969    317     Missing in isoform PACE4C.  VSP_005433
Alternative sequence  665 – 969    305     Missing in isoform PACE4D.  VSP_005435
Alternative sequence  680 – 693    14      AQSTP…ILQTS → G in isoform PACE4A-II and isoform PACE4E-II.  VSP_005436
Alternative sequence  901 – 969    69      CDENC…CLLAG → YGPPGGERQATVSSKGVPGGQSLSASSPGAGEGMLHHPTVDRSPFTELLRGLRPFVHWMHICWVPAVGRHRAAAG in isoform PACE4E-I and isoform PACE4E-II.  VSP_005437
Natural variant       502          1       C → R. Corresponds to variant rs1058260.  VAR_051824

Experimental info

Sequence conflict     639          1       T → I in BAA21624. Ref.6
Sequence conflict     639          1       T → I in BAA21625. Ref.6
Sequence conflict     639          1       T → I in BAA21626. Ref.6
Sequence conflict     639          1       T → I in BAA21627. Ref.6

Sequences

Isoform PACE4A-I (PACE4).

Last modified December 1, 1992. Version 1.
Checksum: A3599CC278D09B05
Length: 969. Mass (Da): 106,420.

        10         20         30         40         50         60 
MPPRAPPAPG PRPPPRAAAA TDTAAGAGGA GGAGGAGGPG FRPLAPRPWR WLLLLALPAA 

        70         80         90        100        110        120 
CSAPPPRPVY TNHWAVQVLG GPAEADRVAA AHGYLNLGQI GNLEDYYHFY HSKTFKRSTL 

       130        140        150        160        170        180 
SSRGPHTFLR MDPQVKWLQQ QEVKRRVKRQ VRSDPQALYF NDPIWSNMWY LHCGDKNSRC 

       190        200        210        220        230        240 
RSEMNVQAAW KRGYTGKNVV VTILDDGIER NHPDLAPNYD SYASYDVNGN DYDPSPRYDA 

       250        260        270        280        290        300 
SNENKHGTRC AGEVAASANN SYCIVGIAYN AKIGGIRMLD GDVTDVVEAK SLGIRPNYID 

       310        320        330        340        350        360 
IYSASWGPDD DGKTVDGPGR LAKQAFEYGI KKGRQGLGSI FVWASGNGGR EGDYCSCDGY 

       370        380        390        400        410        420 
TNSIYTISVS SATENGYKPW YLEECASTLA TTYSSGAFYE RKIVTTDLRQ RCTDGHTGTS 

       430        440        450        460        470        480 
VSAPMVAGII ALALEANSQL TWRDVQHLLV KTSRPAHLKA SDWKVNGAGH KVSHFYGFGL 

       490        500        510        520        530        540 
VDAEALVVEA KKWTAVPSQH MCVAASDKRP RSIPLVQVLR TTALTSACAE HSDQRVVYLE 

       550        560        570        580        590        600 
HVVVRTSISH PRRGDLQIYL VSPSGTKSQL LAKRLLDLSN EGFTNWEFMT VHCWGEKAEG 

       610        620        630        640        650        660 
QWTLEIQDLP SQVRNPEKQG KLKEWSLILY GTAEHPYHTF SAHQSRSRML ELSAPELEPP 

       670        680        690        700        710        720 
KAALSPSQVE VPEDEEDYTA QSTPGSANIL QTSVCHPECG DKGCDGPNAD QCLNCVHFSL 

       730        740        750        760        770        780 
GSVKTSRKCV SVCPLGYFGD TAARRCRRCH KGCETCSSRA ATQCLSCRRG FYHHQEMNTC 

       790        800        810        820        830        840 
VTLCPAGFYA DESQKNCLKC HPSCKKCVDE PEKCTVCKEG FSLARGSCIP DCEPGTYFDS 

       850        860        870        880        890        900 
ELIRCGECHH TCGTCVGPGR EECIHCAKNF HFHDWKCVPA CGEGFYPEEM PGLPHKVCRR 

       910        920        930        940        950        960 
CDENCLSCAG SSRNCSRCKT GFTQLGTSCI TNHTCSNADE TFCEMVKSNR LCERKLFIQF 


CCRTCLLAG 
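
The sequence above is displayed in blocks of ten residues under position rulers, which has to be undone before the sequence can be used in code. The following Python sketch is one hedged way to do that and is not part of the entry; `parse_numbered_sequence` is a hypothetical helper name, and the example input is just the first two rows of the display.

```python
def parse_numbered_sequence(text):
    """Join a ruler-and-blocks sequence display into one contiguous string."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines and ruler lines, which contain only integers.
        if not tokens or all(token.isdigit() for token in tokens):
            continue
        residues.append("".join(tokens))
    return "".join(residues)

# First two rows of the canonical sequence display (residues 1-120).
EXAMPLE = """\
        10         20         30         40         50         60
MPPRAPPAPG PRPPPRAAAA TDTAAGAGGA GGAGGAGGPG FRPLAPRPWR WLLLLALPAA

        70         80         90        100        110        120
CSAPPPRPVY TNHWAVQVLG GPAEADRVAA AHGYLNLGQI GNLEDYYHFY HSKTFKRSTL
"""

sequence = parse_numbered_sequence(EXAMPLE)
print(len(sequence), sequence[:10])  # 120 MPPRAPPAPG
```

Applied to the full display, the result should have the 969 residues stated for the canonical isoform.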

Isoform PACE4A-II.

Checksum: BA240E3EEBCC862F
Length: 956. Mass (Da): 105,121.

Isoform PACE4B (PACE4.1).

Checksum: 10DB376359A7F1AF
Length: 487. Mass (Da): 53,044.

Isoform PACE4C.

Checksum: 880D99278881942C
Length: 652. Mass (Da): 71,771.

Isoform PACE4CS.

Checksum: 19BCB5350278C621
Length: 623. Mass (Da): 68,238.

Isoform PACE4D.

Checksum: 46C1F64CAEA0E3EB
Length: 497. Mass (Da): 54,900.

Isoform PACE4E-I.

Checksum: 31983E526116A67C
Length: 975. Mass (Da): 106,674.

Isoform PACE4E-II.

Checksum: F16ABF9230DE5F01
Length: 962. Mass (Da): 105,375.

References

[1] "Identification of a second human subtilisin-like protease gene in the fes/fps region of chromosome 15."
Kiefer M.C., Tucker J.E., Joh R., Landsberg K.E., Saltman D., Barr P.J.
DNA Cell Biol. 10:757-769(1991)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORMS PACE4A-I AND PACE4B).
Tissue: Hepatoma and Kidney.
[2] "Identification of novel cDNAs encoding human kexin-like protease, PACE4 isoforms."
Tsuji A., Higashine K., Hine C., Mori K., Tamai Y., Nagamune H., Matsuda Y.
Biochem. Biophys. Res. Commun. 200:943-950(1994)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORMS PACE4C AND PACE4D).
Tissue: Placenta.
[3] Erratum
Tsuji A., Higashine K., Hine C., Mori K., Tamai Y., Nagamune H., Matsuda Y.
Biochem. Biophys. Res. Commun. 204:1381-1382(1994)
[4] "Identification of a novel PACE4 isoform, PACE4E."
Mori K., Imamaki A., Kii S., Nagamune H., Nagahama M., Tsuji A., Matsuda Y.
Submitted (SEP-1996) to the EMBL/GenBank/DDBJ databases
Cited for: NUCLEOTIDE SEQUENCE (ISOFORM PACE4A-II).
Tissue: Placenta.
[5] "A novel human PACE4 isoform, PACE4E is an active processing protease containing a hydrophobic cluster at the carboxy terminus."
Mori K., Kii S., Tsuji A., Nagahama M., Imamaki A., Hayashi K., Akamatsu T., Nagamune H., Matsuda Y.
J. Biochem. 121:941-948(1997)
Cited for: NUCLEOTIDE SEQUENCE [MRNA] (ISOFORMS PACE4E-I AND PACE4E-II).
Tissue: Cerebellum.
[6] "Genomic organization and alternative splicing of human PACE4 (SPC4), kexin-like processing endoprotease."
Tsuji A., Hine C., Tamai Y., Yonemoto K., Mori K., Yoshida S., Bando M., Sakai E., Mori K., Akamatsu T., Matsuda Y.
J. Biochem. 122:438-452(1997)
Cited for: NUCLEOTIDE SEQUENCE [GENOMIC DNA] (ISOFORMS PACE4A-I; PACE4A-II; PACE4CS; PACE4D; PACE4E-I AND PACE4E-II).
[7] "Functional analysis of human PACE4-A and PACE4-C isoforms: identification of a new PACE4-CS isoform."
Zhong M., Benjannet S., Lazure C., Munzer S., Seidah N.G.
FEBS Lett. 396:31-36(1996)
Cited for: ALTERNATIVE SPLICING (ISOFORM PACE4CS).
[8] "Endoprotease PACE4 is Ca2+-dependent and temperature-sensitive and can partly rescue the phenotype of a furin-deficient cell strain."
Sucic J.F., Moehring J.M., Inocencio N.M., Luchini J.W., Moehring T.J.
Biochem. J. 339:639-647(1999)
Cited for: CHARACTERIZATION.
[9] "Biosynthetic processing and quaternary interactions of proprotein convertase SPC4 (PACE4)."
Nagahama M., Taniguchi T., Hashimoto E., Imamaki A., Mori K., Tsuji A., Matsuda Y.
FEBS Lett. 434:155-159(1998)
Cited for: PROTEOLYTIC PROCESSING.

Cross-references

Sequence databases

EMBL/GenBank/DDBJ:
M80482 mRNA. Translation: AAA59998.1.
AB001914 Genomic DNA. Translation: BAA21620.1.
AB001914 Genomic DNA. Translation: BAA21621.1.
AB001914 Genomic DNA. Translation: BAA21622.1.
AB001914 Genomic DNA. Translation: BAA21623.1.
AB001914 Genomic DNA. Translation: BAA21624.1.
AB001914 Genomic DNA. Translation: BAA21625.1.
AB001914 Genomic DNA. Translation: BAA21626.1.
AB001914 Genomic DNA. Translation: BAA21627.1.
D28513 mRNA. Translation: BAA05871.1.
D28514 mRNA. Translation: BAA05872.1.
D87995 mRNA. Translation: BAA21793.1.
D87993 mRNA. Translation: BAA21791.1.
D87994 mRNA. Translation: BAA21792.1.
PIR: A39490.
B39490.
JC2191.
JC2192.
JC5570.
JC5571.
RefSeq: NP_002561.1. NM_002570.3.
NP_612192.1. NM_138319.2.
NP_612193.1. NM_138320.1.
NP_612194.1. NM_138321.1.
NP_612195.1. NM_138322.2.
NP_612196.1. NM_138323.1.
NP_612197.1. NM_138324.1.
NP_612198.2. NM_138325.2.
UniGene: Hs.498494.
Hs.665989.

3D structure databases

ProteinModelPortal: P29122.
SMR: P29122. Positions 70-141, 162-633, 690-940.

Protein-protein interaction databases

BioGrid: 111083. 3 interactions.
DIP: DIP-29903N.
IntAct: P29122. 3 interactions.
MINT: MINT-3011239.

Chemistry

BindingDB: P29122.
ChEMBL: CHEMBL2951.

Protein family/group databases

MEROPS: S08.075.

PTM databases

PhosphoSite: P29122.

Polymorphism databases

DMDM: 129542.

Proteomic databases

PaxDb: P29122.
PRIDE: P29122.

Genome annotation databases

Ensembl: ENST00000348070; ENSP00000305056; ENSG00000140479.
GeneID: 5046.
KEGG: hsa:5046.
UCSC: uc002bwy.3. human. [P29122-1]
uc002bxa.2. human. [P29122-7]
uc002bxb.2. human. [P29122-8]
uc002bxc.1. human. [P29122-4]
uc002bxd.1. human. [P29122-5]
uc002bxg.1. human. [P29122-3]
uc010bpe.3. human. [P29122-2]

Organism-specific databases

CTD: 5046.
GeneCards: GC15M101844.
HGNC: HGNC:8569. PCSK6.
HPA: HPA004774.
MIM: 167405. gene.
neXtProt: NX_P29122.
PharmGKB: PA32895.

Phylogenomic databases

eggNOG: COG4935.
HOVERGEN: HBG008705.
InParanoid: P29122.
KO: K08672.
PhylomeDB: P29122.
TreeFam: TF314277.

Enzyme and pathway databases

Reactome: REACT_111045. Developmental Biology.
REACT_111102. Signal Transduction.
SignaLink: P29122.

Gene expression databases

ArrayExpress: P29122.
Bgee: P29122.
Genevestigator: P29122.

Family and domain databases

Gene3D: 2.60.120.260. 1 hit.
3.40.50.200. 1 hit.
InterPro: IPR006212. Furin_repeat.
IPR008979. Galactose-bd-like.
IPR009030. Growth_fac_rcpt_N_dom.
IPR000209. Peptidase_S8/S53_dom.
IPR023827. Peptidase_S8_Asp-AS.
IPR022398. Peptidase_S8_His-AS.
IPR023828. Peptidase_S8_Ser-AS.
IPR015500. Peptidase_S8_subtilisin-rel.
IPR010909. PLAC.
IPR009020. Prot_inh_propept.
IPR002884. PrprotnconvertsP.
PANTHER: PTHR10795. PTHR10795. 1 hit.
Pfam: PF01483. P_proprotein. 1 hit.
PF00082. Peptidase_S8. 1 hit.
PF08686. PLAC. 1 hit.
PRINTS: PR00723. SUBTILISIN.
SMART: SM00261. FU. 5 hits.
SUPFAM: SSF49785. SSF49785. 1 hit.
SSF52743. SSF52743. 1 hit.
SSF54897. SSF54897. 1 hit.
SSF57184. SSF57184. 2 hits.
PROSITE: PS50900. PLAC. 1 hit.
PS00136. SUBTILASE_ASP. 1 hit.
PS00137. SUBTILASE_HIS. 1 hit.
PS00138. SUBTILASE_SER. 1 hit.

Other

GeneWiki: PCSK6.
GenomeRNAi: 5046.
NextBio: 19426.
PRO: P29122.

Entry information

Entry name: PCSK6_HUMAN
Accession: Primary (citable) accession number: P29122
Secondary accession number(s): Q15099, Q15100, Q9UEG7, Q9UEJ1, Q9UEJ2, Q9UEJ7, Q9UEJ8, Q9UEJ9, Q9Y4G9, Q9Y4H0, Q9Y4H1
Entry history
Integrated into UniProtKB/Swiss-Prot: December 1, 1992
Last sequence update: December 1, 1992
Last modified: April 16, 2014
This is version 147 of the entry and version 1 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Relevant documents

SIMILARITY comments

Index of protein domains and families

Peptidase families

Classification of peptidase families and list of entries

MIM cross-references

Online Mendelian Inheritance in Man (MIM) cross-references in UniProtKB/Swiss-Prot

Human polymorphisms and disease mutations

Index of human polymorphisms and disease mutations

Human entries with polymorphisms or disease mutations

List of human entries with polymorphisms or disease mutations

Human chromosome 15

Human chromosome 15: entries, gene names and cross-references to MIM