ID   WFDC2_HUMAN             Reviewed;         124 AA.
AC   Q14508; Q6IB27; Q8WXV9; Q8WXW0; Q8WXW1; Q8WXW2; Q96KJ1;
DT   15-JUL-1998, integrated into UniProtKB/Swiss-Prot.
DT   23-JAN-2002, sequence version 2.
DT   22-JUL-2008, entry version 85.
DE   RecName: Full=WAP four-disulfide core domain protein 2;
DE   AltName: Full=Major epididymis-specific protein E4;
DE   AltName: Full=Epididymal secretory protein E4;
DE   AltName: Full=Putative protease inhibitor WAP5;
DE   Flags: Precursor;
GN   Name=WFDC2; Synonyms=HE4, WAP5;
OS   Homo sapiens (Human).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
OC   Catarrhini; Hominidae; Homo.
OX   NCBI_TaxID=9606;
RN   [1]
RP   NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1).
RC   TISSUE=Epididymis;
RX   MEDLINE=92153963; PubMed=1686187;
RA   Kirchhoff C., Habben L., Ivell R., Krull N.;
RT   "A major human epididymis-specific cDNA encodes a protein with
RT   sequence homology to extracellular proteinase inhibitors.";
RL   Biol. Reprod. 45:350-357(1991).
RN   [2]
RP   NUCLEOTIDE SEQUENCE [MRNA] (ISOFORMS 1; 2; 3; 4 AND 5).
RX   MEDLINE=21962329; PubMed=11965550; DOI=10.1038/sj.onc.1205363;
RA   Bingle L., Singleton V., Bingle C.D.;
RT   "The putative ovarian tumour marker gene HE4 (WFDC2), is expressed in
RT   normal tissues and undergoes complex alternative splicing to yield
RT   multiple protein isoforms.";
RL   Oncogene 21:2768-2773(2002).
RN   [3]
RP   NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1).
RX   PubMed=12839961;
RA   Hellstrom I., Raycraft J., Hayden-Ledbetter M., Ledbetter J.A.,
RA   Schummer M., McIntosh M., Drescher C., Urban N., Hellstrom K.E.;
RT   "The HE4 (WFDC2) protein is a biomarker for ovarian carcinoma.";
RL   Cancer Res. 63:3695-3700(2003).
RN   [4]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
RA   Ebert L., Schick M., Neubert P., Schatten R., Henze S., Korn B.;
RT   "Cloning of human full open reading frames in Gateway(TM) system entry
RT   vector (pDONR201).";
RL   Submitted (JUN-2004) to the EMBL/GenBank/DDBJ databases.
RN   [5]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].
RX   MEDLINE=21638749; PubMed=11780052; DOI=10.1038/414865a;
RA   Deloukas P., Matthews L.H., Ashurst J.L., Burton J., Gilbert J.G.R.,
RA   Jones M., Stavrides G., Almeida J.P., Babbage A.K., Bagguley C.L.,
RA   Bailey J., Barlow K.F., Bates K.N., Beard L.M., Beare D.M.,
RA   Beasley O.P., Bird C.P., Blakey S.E., Bridgeman A.M., Brown A.J.,
RA   Buck D., Burrill W.D., Butler A.P., Carder C., Carter N.P.,
RA   Chapman J.C., Clamp M., Clark G., Clark L.N., Clark S.Y., Clee C.M.,
RA   Clegg S., Cobley V.E., Collier R.E., Connor R.E., Corby N.R.,
RA   Coulson A., Coville G.J., Deadman R., Dhami P.D., Dunn M.,
RA   Ellington A.G., Frankland J.A., Fraser A., French L., Garner P.,
RA   Grafham D.V., Griffiths C., Griffiths M.N.D., Gwilliam R., Hall R.E.,
RA   Hammond S., Harley J.L., Heath P.D., Ho S., Holden J.L., Howden P.J.,
RA   Huckle E., Hunt A.R., Hunt S.E., Jekosch K., Johnson C.M., Johnson D.,
RA   Kay M.P., Kimberley A.M., King A., Knights A., Laird G.K., Lawlor S.,
RA   Lehvaeslaiho M.H., Leversha M.A., Lloyd C., Lloyd D.M., Lovell J.D.,
RA   Marsh V.L., Martin S.L., McConnachie L.J., McLay K., McMurray A.A.,
RA   Milne S.A., Mistry D., Moore M.J.F., Mullikin J.C., Nickerson T.,
RA   Oliver K., Parker A., Patel R., Pearce T.A.V., Peck A.I.,
RA   Phillimore B.J.C.T., Prathalingam S.R., Plumb R.W., Ramsay H.,
RA   Rice C.M., Ross M.T., Scott C.E., Sehra H.K., Shownkeen R., Sims S.,
RA   Skuce C.D., Smith M.L., Soderlund C., Steward C.A., Sulston J.E.,
RA   Swann R.M., Sycamore N., Taylor R., Tee L., Thomas D.W., Thorpe A.,
RA   Tracey A., Tromans A.C., Vaudin M., Wall M., Wallis J.M.,
RA   Whitehead S.L., Whittaker P., Willey D.L., Williams L., Williams S.A.,
RA   Wilming L., Wray P.W., Hubbard T., Durbin R.M., Bentley D.R., Beck S.,
RA   Rogers J.;
RT   "The DNA sequence and comparative analysis of human chromosome 20.";
RL   Nature 414:865-871(2001).
RN   [6]
RP   NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 1).
RC   TISSUE=Colon;
RX   PubMed=15489334; DOI=10.1101/gr.2596504;
RG   The MGC Project Team;
RT   "The status, quality, and expansion of the NIH full-length cDNA
RT   project: the Mammalian Gene Collection (MGC).";
RL   Genome Res. 14:2121-2127(2004).
RN   [7]
RP   SUBCELLULAR LOCATION, AND TISSUE SPECIFICITY.
RX   PubMed=15781627; DOI=10.1158/0008-5472.CAN-04-3924;
RA   Drapkin R., von Horsten H.H., Lin Y., Mok S.C., Crum C.P., Welch W.R.,
RA   Hecht J.L.;
RT   "Human epididymis protein 4 (HE4) is a secreted glycoprotein that is
RT   overexpressed by serous and endometrioid ovarian carcinomas.";
RL   Cancer Res. 65:2162-2169(2005).
RN   [8]
RP   GLYCOSYLATION [LARGE SCALE ANALYSIS] AT ASN-44, AND MASS SPECTROMETRY.
RC   TISSUE=Saliva;
RX   PubMed=16740002; DOI=10.1021/pr050492k;
RA   Ramachandran P., Boontheung P., Xie Y., Sondej M., Wong D.T.,
RA   Loo J.A.;
RT   "Identification of N-linked glycoproteins in human saliva by
RT   glycoprotein capture and mass spectrometry.";
RL   J. Proteome Res. 5:1493-1503(2006).
CC   -!- SUBCELLULAR LOCATION: Secreted.
CC   -!- ALTERNATIVE PRODUCTS:
CC       Event=Alternative splicing; Named isoforms=5;
CC         Comment=Additional isoforms seem to exist;
CC       Name=1;
CC         IsoId=Q14508-1; Sequence=Displayed;
CC       Name=2; Synonyms=HE4-V3;
CC         IsoId=Q14508-2; Sequence=VSP_007666, VSP_007667;
CC       Name=3; Synonyms=HE4-V2;
CC         IsoId=Q14508-3; Sequence=VSP_007668;
CC       Name=4; Synonyms=HE4-V1;
CC         IsoId=Q14508-4; Sequence=VSP_007669, VSP_007671;
CC       Name=5; Synonyms=HE4-V4;
CC         IsoId=Q14508-5; Sequence=VSP_007670, VSP_007672;
CC   -!- TISSUE SPECIFICITY: Expressed in a number of normal tissues,
CC       including male reproductive system, regions of the respiratory
CC       tract and nasopharynx. Highly expressed in a number of tumor
CC       cell lines, such as ovarian, colon, breast, lung and renal cell
CC       lines. Initially described as being exclusively transcribed in the
CC       epididymis.
CC   -!- SIMILARITY: Contains 2 WAP domains.
CC   ---------------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution (CC BY 4.0) License
CC   ---------------------------------------------------------------------------
DR   EMBL; X63187; CAA44869.1; -; mRNA.
DR   EMBL; AF330259; AAL37485.1; -; mRNA.
DR   EMBL; AF330260; AAL37486.1; -; mRNA.
DR   EMBL; AF330261; AAL37487.1; -; mRNA.
DR   EMBL; AF330262; AAL37488.1; -; mRNA.
DR   EMBL; AY212888; AAO52683.1; -; mRNA.
DR   EMBL; CR456977; CAG33258.1; -; mRNA.
DR   EMBL; AL031663; CAB37641.1; -; Genomic_DNA.
DR   EMBL; BC046106; AAH46106.1; -; mRNA.
DR   PIR; S25454; S25454.
DR   RefSeq; NP_006094.3; -.
DR   UniGene; Hs.2719; -.
DR   HSSP; Q9N0L8; 1TWP.
DR   PeptideAtlas; Q14508; -.
DR   Ensembl; ENSG00000101443; Homo sapiens.
DR   GeneID; 10406; -.
DR   KEGG; hsa:10406; -.
DR   HGNC; HGNC:15939; WFDC2.
DR   PharmGKB; PA38059; -.
DR   HOVERGEN; Q14508; -.
DR   ArrayExpress; Q14508; -.
DR   CleanEx; HS_WFDC2; -.
DR   GermOnline; ENSG00000101443; Homo sapiens.
DR   GO; GO:0005615; C:extracellular space; TAS:ProtInc.
DR   GO; GO:0004866; F:endopeptidase inhibitor activity; TAS:ProtInc.
DR   GO; GO:0006508; P:proteolysis; TAS:ProtInc.
DR   GO; GO:0007283; P:spermatogenesis; TAS:ProtInc.
DR   InterPro; IPR015874; 4-disulphide_core.
DR   InterPro; IPR008197; Whey_acidic_protein_4-diS_core.
DR   Gene3D; G3DSA:4.10.75.10; Whey_acidic_protein_4-diS_core; 2.
DR   Pfam; PF00095; WAP; 2.
DR   PRINTS; PR00003; 4DISULPHCORE.
DR   ProDom; PD001224; Prot_inh_I17; 1.
DR   SMART; SM00217; WAP; 2.
DR   PROSITE; PS00317; 4_DISULFIDE_CORE; 2.
PE   1: Evidence at protein level;
KW   Alternative splicing; Glycoprotein; Protease inhibitor; Repeat;
KW   Secreted; Serine protease inhibitor; Signal.
FT   SIGNAL        1     30       Potential.
FT   CHAIN        31    124       WAP four-disulfide core domain protein 2.
FT                                /FTId=PRO_0000041370.
FT   DOMAIN       32     74       WAP 1.
FT   DOMAIN       76    124       WAP 2.
FT   CARBOHYD     44     44       N-linked (GlcNAc...).
FT   DISULFID     36     62       By similarity.
FT   DISULFID     45     66       By similarity.
FT   DISULFID     49     61       By similarity.
FT   DISULFID     55     70       By similarity.
FT   DISULFID     80    110       By similarity.
FT   DISULFID     93    114       By similarity.
FT   DISULFID     97    109       By similarity.
FT   DISULFID    103    119       By similarity.
FT   VAR_SEQ       2     23       PACRLGPLAAALLLSLLLFGFT -> LQVQVNLPVSPLPTY
FT                                PYSFFYP (in isoform 2).
FT                                /FTId=VSP_007666.
FT   VAR_SEQ      24     74       Missing (in isoform 2).
FT                                /FTId=VSP_007667.
FT   VAR_SEQ      27     74       Missing (in isoform 3).
FT                                /FTId=VSP_007668.
FT   VAR_SEQ      71     79       SLPNDKEGS -> LLCPNGQLAE (in isoform 4).
FT                                /FTId=VSP_007669.
FT   VAR_SEQ      75    102       DKEGSCPQVNINFPQLGLCRDQCQVDSQ -> ALFHWHLKT
FT                                RRLWEISGPRPRRPTWDSS (in isoform 5).
FT                                /FTId=VSP_007670.
FT   VAR_SEQ      80    124       Missing (in isoform 4).
FT                                /FTId=VSP_007671.
FT   VAR_SEQ     103    124       Missing (in isoform 5).
FT                                /FTId=VSP_007672.
FT   CONFLICT     71     72       SL -> LLC (in Ref. 1; CAA44869 and 2;
FT                                AAL37485).
FT   CONFLICT    101    101       S -> T (in Ref. 1; CAA44869).
SQ   SEQUENCE   124 AA;  12993 MW;  9536B00B385259AD CRC64;
     MPACRLGPLA AALLLSLLLF GFTLVSGTGA EKTGVCPELQ ADQNCTQECV SDSECADNLK
     CCSAGCATFC SLPNDKEGSC PQVNINFPQL GLCRDQCQVD SQCPGQMKCC RNGCGKVSCV
     TPNF
//