
Q9C0A1 (ZFHX2_HUMAN) Reviewed, UniProtKB/Swiss-Prot

Last modified April 16, 2014. Version 110.


Names and origin

Protein names
Recommended name: Zinc finger homeobox protein 2
Alternative name(s): Zinc finger homeodomain protein 2
Short name: ZFH-2
Gene names
Name: ZFHX2
Synonyms: KIAA1056, KIAA1762, ZNF409
Organism: Homo sapiens (Human) [Reference proteome]
Taxonomic identifier: 9606 [NCBI]
Taxonomic lineage: Eukaryota > Metazoa > Chordata > Craniata > Vertebrata > Euteleostomi > Mammalia > Eutheria > Euarchontoglires > Primates > Haplorrhini > Catarrhini > Hominidae > Homo

Protein attributes

Sequence length: 2572 AA.
Sequence status: Complete.
Protein existence: Evidence at transcript level.

General annotation (Comments)

Function

May be involved in transcriptional regulation.

Subcellular location

Nucleus (probable).

Sequence similarities

Contains 13 C2H2-type zinc fingers.

Contains 3 homeobox DNA-binding domains.

Sequence caution

The sequence BAA83008.2 differs from that shown. Reason: Erroneous initiation. Translation N-terminally shortened.

Ontologies

Alternative products

This entry describes 2 isoforms produced by alternative splicing.
Isoform 1 (identifier: Q9C0A1-1)

This isoform has been chosen as the 'canonical' sequence. All positional information in this entry refers to it. This is also the sequence that appears in the downloadable versions of the entry.
Isoform 2 (identifier: Q9C0A1-2)

The sequence of this isoform differs from the canonical sequence as follows:
     854-862: FLLDMEGAE → RTETGLLIK
     863-2572: Missing.
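
The two differences listed above can be applied mechanically to the canonical sequence. The following is a minimal sketch, assuming 1-based inclusive coordinates as used in this entry; the function name `apply_variations` and the filler residues are illustrative and not part of the entry. Only residues 854-862 are copied from the canonical sequence; the flanking `X` residues are placeholders so that the coordinates line up.

```python
def apply_variations(canonical, variations):
    """Apply 1-based, inclusive (start, end, replacement) edits to a sequence.

    An empty replacement deletes the range (a "Missing" segment).
    """
    seq = canonical
    # Work from the C-terminal end so that earlier coordinates stay valid.
    for start, end, replacement in sorted(variations, reverse=True):
        seq = seq[:start - 1] + replacement + seq[end:]
    return seq


# Placeholder canonical sequence (assumption): only 854-862 is real data.
canonical = "X" * 853 + "FLLDMEGAE" + "X" * 1710

isoform2 = apply_variations(
    canonical,
    [(854, 862, "RTETGLLIK"), (863, 2572, "")],
)
print(len(isoform2), isoform2[-9:])  # 862 RTETGLLIK
```

With the real 2572-residue canonical sequence in place of the placeholder, the same call would yield an 862-residue product, which matches the length reported for isoform 2.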

Sequence annotation (Features)

Feature key            Position(s)    Length  Description                      Feature identifier

Molecule processing

Chain                  1 – 2572       2572    Zinc finger homeobox protein 2   PRO_0000047243

Regions

Zinc finger            821 – 845      25      C2H2-type 4
Zinc finger            870 – 894      25      C2H2-type 5
Zinc finger            1009 – 1032    24      C2H2-type 6
Zinc finger            1191 – 1217    27      C2H2-type 7
Zinc finger            1248 – 1272    25      C2H2-type 8
Zinc finger            1480 – 1503    24      C2H2-type 9
DNA binding            1595 – 1654    60      Homeobox 1
Zinc finger            1670 – 1696    27      C2H2-type 10; degenerate
Zinc finger            1769 – 1791    23      C2H2-type 11
DNA binding            1857 – 1916    60      Homeobox 2
DNA binding            2065 – 2124    60      Homeobox 3
Zinc finger            2451 – 2472    22      C2H2-type 12; degenerate
Zinc finger            2495 – 2519    25      C2H2-type 13
Compositional bias     605 – 710      106     Pro-rich
Compositional bias     1061 – 1144    84      Pro-rich
Compositional bias     1321 – 1471    151     Pro-rich
Compositional bias     1699 – 1759    61      Glu-rich
Compositional bias     1921 – 2018    98      Pro-rich
Compositional bias     2193 – 2424    232     Pro-rich
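
Feature positions in this table are 1-based and inclusive, so a feature's length is end minus start plus one. As a rough sketch of how the coordinates map onto a Python string, the snippet below models a few of the rows above; the `Feature` class and `slice_feature` helper are hypothetical names introduced for illustration only.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    key: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    description: str

    @property
    def length(self):
        # Inclusive coordinates: both endpoints count.
        return self.end - self.start + 1


# A few rows copied from the table above.
FEATURES = [
    Feature("DNA binding", 1595, 1654, "Homeobox 1"),
    Feature("DNA binding", 1857, 1916, "Homeobox 2"),
    Feature("DNA binding", 2065, 2124, "Homeobox 3"),
    Feature("Compositional bias", 1699, 1759, "Glu-rich"),
]


def slice_feature(sequence, feature):
    """Return the residues covered by a feature (0-based slice of the string)."""
    return sequence[feature.start - 1:feature.end]


for f in FEATURES:
    print(f.description, f.length)
```

Passing the canonical sequence from the Sequences section to `slice_feature` would return the residues of each region; the lengths printed here should agree with the Length column.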

Natural variations

Alternative sequence   854 – 862      9       FLLDMEGAE → RTETGLLIK in isoform 2.   VSP_039496
Alternative sequence   863 – 2572     1710    Missing in isoform 2.                 VSP_039497

Experimental info

Sequence conflict      473            1       R → Q in BAA83008. Ref.1
Sequence conflict      550            1       P → T in BAA83008. Ref.1
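
A sequence conflict is a single-residue difference between the canonical sequence and a submitted sequence. The sketch below checks that the canonical residue at each conflict position matches the stated one before substituting it. The helper `apply_substitution` and its `offset` parameter are assumptions made for illustration; the two fragments are residues 471-480 and 541-550 of the canonical sequence as displayed in this entry.

```python
def apply_substitution(sequence, position, ref, alt, offset=1):
    """Replace `ref` with `alt` at a 1-based entry position.

    `offset` is the entry position of the first residue in `sequence`,
    which lets a fragment stand in for the full-length protein.
    """
    index = position - offset
    if not 0 <= index < len(sequence):
        raise IndexError(f"position {position} is outside the sequence")
    if sequence[index] != ref:
        raise ValueError(
            f"expected {ref} at {position}, found {sequence[index]}"
        )
    return sequence[:index] + alt + sequence[index + 1:]


frag_471 = "HMREKHPESN"  # canonical residues 471-480
frag_541 = "AGPGGQGSPP"  # canonical residues 541-550

print(apply_substitution(frag_471, 473, "R", "Q", offset=471))  # HMQEKHPESN
print(apply_substitution(frag_541, 550, "P", "T", offset=541))  # AGPGGQGSPT
```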

Sequences

Isoform 1.

Last modified July 13, 2010. Version 3.
Checksum: 239C8050F65B25C2

Length: 2,572 AA. Mass: 274,176 Da.
        10         20         30         40         50         60 
MATLNSASTT GTTPSPGHNA PSLPSDTFSS STPSDPVTKD PPAASSTSEN MRSSEPGGQL 

        70         80         90        100        110        120 
LESGCGLVPP KEIGEPQEGP DCGHFPPNDP GVEKDKEQEE EEEGLPPMDL SNHLFFTAGG 

       130        140        150        160        170        180 
EAYLVAKLSL PGGSELLLPK GFPWGEAGIK EEPSLPFLAY PPPSHLTALH IQHGFDPIQG 

       190        200        210        220        230        240 
FSSSDQILSH DTSAPSPAAC EERHGAFWSY QLAPNPPGDP KDGPMGNSGG NHVAVFWLCL 

       250        260        270        280        290        300 
LCRLGFSKPQ AFMDHTQSHG VKLTPAQYQG LSGSPAVLQE GDEGCKALIS FLEPKLPARP 

       310        320        330        340        350        360 
SSDIPLDNSS TVNMEANVAQ TEDGPPEAEV QALILLDEEV MALSPPSPPT ATWDPSPTQA 

       370        380        390        400        410        420 
KESPVAAGEA GPDWFPEGQE EDGGLCPPLN QSSPTSKEGG TLPAPVGSPE DPSDPPQPYR 

       430        440        450        460        470        480 
LADDYTPAPA AFQGLSLSSH MSLLHSRNSC KTLKCPKCNW HYKYQQTLDV HMREKHPESN 

       490        500        510        520        530        540 
SHCSYCSAGG AHPRLARGES YNCGYKPYRC DVCNYSTTTK GNLSIHMQSD KHLANLQGFQ 

       550        560        570        580        590        600 
AGPGGQGSPP EASLPPSAGD KEPKTKSSWQ CKVCSYETNI SRNLRIHMTS EKHMQNVLML 

       610        620        630        640        650        660 
HQGLPLGLPP GLMGPGPPPP PGATPTSPPE LFQYFGPQAL GQPQTPLAGP GLRPDKPLEA 

       670        680        690        700        710        720 
QLLLNGFHHV GAPARKFPTS APGSLSPDAH LPPSQLLGSS SDSLPTSPPP DDSLSLKVFR 

       730        740        750        760        770        780 
CLVCQAFSTD SLELLLYHCS IGRSLPEAEW KEVAGDTHRC KLCCYGTQLK ANFQLHLKTD 

       790        800        810        820        830        840 
KHAQKYQLAA HLREGGGAMG TPSPASLGDG APYGSVSPLH LRCNICDFES NSKEKMQLHA 

       850        860        870        880        890        900 
RGAAHEENSQ IYKFLLDMEG AEAGAELGLY HCLLCAWETP SRLAVLQHLR TPAHRDAQAQ 

       910        920        930        940        950        960 
RRLQLLQNGP TTEEGLAALQ SILSFSHGQL RTPGKAPVTP LAEPPTPEKD AQNKTEQLAS 

       970        980        990       1000       1010       1020 
EETENKTGPS RDSANQTTVY CCPYCSFLSP ESSQVRAHTL SQHAVQPKYR CPLCQEQLVG 

      1030       1040       1050       1060       1070       1080 
RPALHFHLSH LHNVVPECVE KLLLVATTVE MTFTTKVLSA PTLSPLDNGQ EPPTHGPEPT 

      1090       1100       1110       1120       1130       1140 
PSRDQAAEGP NLTPEASPDP LPEPPLASVE VPDKPSGSPG QPPSPAPSPV PEPDAQAEDV 

      1150       1160       1170       1180       1190       1200 
APPPTMAEEE EGTTGELRSA EPAPADSRHP LTYRKTTNFA LDKFLDPARP YKCTVCKESF 

      1210       1220       1230       1240       1250       1260 
TQKNILLVHY NSVSHLHKMK KAAIDPSAPA RGEAGAPPTT TAATDKPFKC TVCRVSYNQS 

      1270       1280       1290       1300       1310       1320 
STLEIHMRSV LHQTRSRGTK TDSKIEGPER SQEEPKEGET EGEVGTEKKG PDTSGFISGL 

      1330       1340       1350       1360       1370       1380 
PFLSPPPPPL DLHRFPAPLF TPPVLPPFPL VPESLLKLQQ QQLLLPFYLH DLKVGPKLTL 

      1390       1400       1410       1420       1430       1440 
AGPAPVLSLP AATPPPPPQP PKAELAEREW ERPPMAKEGN EAGPSSPPDP LPNEAARTAA 

      1450       1460       1470       1480       1490       1500 
KALLENFGFE LVIQYNEGKQ AVPPPPTPPP PEALGGGDKL ACGACGKLFS NMLILKTHEE 

      1510       1520       1530       1540       1550       1560 
HVHRRFLPFE ALSRYAAQFR KSYDSLYPPL AEPPKPPDGS LDSPVPHLGP PFLVPEPEAG 

      1570       1580       1590       1600       1610       1620 
GTRAPEERSR AGGHWPIEEE ESSRGNLPPL VPAGRRFSRT KFTEFQTQAL QSFFETSAYP 

      1630       1640       1650       1660       1670       1680 
KDGEVERLAS LLGLASRVVV VWFQNARQKA RKNACEGGSM PTGGGTGGAS GCRRCHATFS 

      1690       1700       1710       1720       1730       1740 
CVFELVRHLK KCYDDQTLEE EEEEAERGEE EEEVEEEEVE EEQGLEPPAG PEGPLPEPPD 

      1750       1760       1770       1780       1790       1800 
GEELSQAEAT KAGGKEPEEK ATPSPSPAHT CDQCAISFSS QDLLTSHRRL HFLPSLQPSA 

      1810       1820       1830       1840       1850       1860 
PPQLLDLPLL VFGERNPLVA ATSPMPGPPL KRKHEDGSLS PTGSEAGGGG EGEPPRDKRL 

      1870       1880       1890       1900       1910       1920 
RTTILPEQLE ILYRWYMQDS NPTRKMLDCI SEEVGLKKRV VQVWFQNTRA RERKGQFRST 

      1930       1940       1950       1960       1970       1980 
PGGVPSPAVK PPATATPASL PKFNLLLGKV DDGTGREAPK REAPAFPYPT ATLASGPQPF 

      1990       2000       2010       2020       2030       2040 
LPPGKEATTP TPEPPLPLLP PPPPSEEEGP EEPPKASPES EACSLSAGDL SDSSASSLAE 

      2050       2060       2070       2080       2090       2100 
PESPGAGGTS GGPGGGTGVP DGMGQRRYRT QMSSLQLKIM KACYEAYRTP TMQECEVLGE 

      2110       2120       2130       2140       2150       2160 
EIGLPKRVIQ VWFQNARAKE KKAKLQGTAA GSTGGSSEGL LAAQRTDCPY CDVKYDFYVS 

      2170       2180       2190       2200       2210       2220 
CRGHLFSRQH LAKLKEAVRA QLKSESKCYD LAPAPEAPPA LKAPPATTPA SMPLGAAPTL 

      2230       2240       2250       2260       2270       2280 
PRLAPVLLSG PALAQPPLGN LAPFNSGPAA SSGLLGLATS VLPTTTVVQT AGPGRPLPQR 

      2290       2300       2310       2320       2330       2340 
PMPDQTNTST AGTTDPVPGP PTEPLGDKVS SERKPVAGPT SSSNDALKNL KALKTTVPAL 

      2350       2360       2370       2380       2390       2400 
LGGQFLPFPL PPAGGTAPPA VFGPQLQGAY FQQLYGMKKG LFPMNPMIPQ TLIGLLPNAL 

      2410       2420       2430       2440       2450       2460 
LQPPPQPPEP TATAPPKPPE LPAPGEGEAG EVDELLTGST GISTVDVTHR YLCRQCKMAF 

      2470       2480       2490       2500       2510       2520 
DGEAPATAHQ RSFCFFGRGS GGSMPPPLRV PICTYHCLAC EVLLSGREAL ASHLRSSAHR 

      2530       2540       2550       2560       2570 
RKAAPPQGGP PISITNAATA ASAAVAFAKE EARLPHTDSN PKTTTTSTLL AL 
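
The display above interleaves position rulers with rows of ten-residue groups. A minimal sketch for turning such a block back into one contiguous string is shown here, using the first two rows as sample input; the function name `parse_sequence_block` is an assumption, not part of any UniProt tool.

```python
def parse_sequence_block(text):
    """Join ten-residue groups into one string, skipping ruler lines."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines and position rulers (every token is a number).
        if not tokens or all(t.isdigit() for t in tokens):
            continue
        residues.extend(tokens)
    return "".join(residues)


block = """
        10         20         30         40         50         60
MATLNSASTT GTTPSPGHNA PSLPSDTFSS STPSDPVTKD PPAASSTSEN MRSSEPGGQL

        70         80         90        100        110        120
LESGCGLVPP KEIGEPQEGP DCGHFPPNDP GVEKDKEQEE EEEGLPPMDL SNHLFFTAGG
"""

seq = parse_sequence_block(block)
print(len(seq), seq[:10])  # 120 MATLNSASTT
```

Applied to the whole block, the result would be expected to have the 2,572 residues reported for isoform 1.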


Isoform 2.

Checksum: 4F3A14A941E610B7

Length: 862 AA. Mass: 91,826 Da.

References

[1] "Prediction of the coding sequences of unidentified human genes. XIX. The complete sequences of 100 new cDNA clones from brain which code for large proteins in vitro."
Nagase T., Kikuno R., Hattori A., Kondo Y., Okumura K., Ohara O.
DNA Res. 7:347-355(2000)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2), NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] OF 1113-2572 (ISOFORM 1).
Tissue: Brain.
[2] "The DNA sequence and analysis of human chromosome 14."
Heilig R., Eckenberg R., Petit J.-L., Fonknechten N., Da Silva C., Cattolico L., Levy M., Barbe V., De Berardinis V., Ureta-Vidal A., Pelletier E., Vico V., Anthouard V., Rowen L., Madan A., Qin S., Sun H., Du H., Pepin K., Artiguenave F., Robert C., Cruaud C., Bruels T., Jaillon O., Friedlander L., Samson G., Brottier P., Cure S., Segurens B., Aniere F., Samain S., Crespeau H., Abbasi N., Aiach N., Boscus D., Dickhoff R., Dors M., Dubois I., Friedman C., Gouyvenoux M., James R., Madan A., Mairey-Estrada B., Mangenot S., Martins N., Menard M., Oztas S., Ratcliffe A., Shaffer T., Trask B., Vacherie B., Bellemere C., Belser C., Besnard-Gonnet M., Bartol-Mavel D., Boutard M., Briez-Silla S., Combette S., Dufosse-Laurent V., Ferron C., Lechaplais C., Louesse C., Muselet D., Magdelenat G., Pateau E., Petit E., Sirvain-Trukniewicz P., Trybou A., Vega-Czarny N., Bataille E., Bluet E., Bordelais I., Dubois M., Dumont C., Guerin T., Haffray S., Hammadi R., Muanga J., Pellouin V., Robert D., Wunderle E., Gauguet G., Roy A., Sainte-Marthe L., Verdier J., Verdier-Discala C., Hillier L.W., Fulton L., McPherson J., Matsuda F., Wilson R., Scarpelli C., Gyapay G., Wincker P., Saurin W., Quetier F., Waterston R., Hood L., Weissenbach J.
Nature 421:601-607(2003)
Cited for: NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].

Cross-references

Sequence databases

EMBL / GenBank / DDBJ:
    AB028979 mRNA. Translation: BAA83008.2. Different initiation.
    AB051549 mRNA. Translation: BAB21853.1.
    AL132855 Genomic DNA. No translation available.
    AL135999 Genomic DNA. No translation available.
RefSeq: NP_207646.2. NM_033400.2.
    XP_005268191.1. XM_005268134.1.
UniGene: Hs.508937.

3D structure databases

ProteinModelPortal: Q9C0A1.
SMR: Q9C0A1. Positions 433-590, 980-1027, 1188-1269, 1593-1653, 1853-1914, 2063-2122, 2135-2187.
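
The SMR position list gives the residue ranges of the 2572-residue sequence that have an associated model. As a rough sketch, assuming the ranges are 1-based and inclusive like the rest of this entry, the covered fraction can be computed as follows; `parse_ranges` and `covered_residues` are illustrative helper names.

```python
def parse_ranges(text):
    """Parse 'a-b, c-d, ...' into a list of (start, end) integer pairs."""
    ranges = []
    for part in text.split(","):
        start, end = part.strip().split("-")
        ranges.append((int(start), int(end)))
    return ranges


def covered_residues(ranges):
    """Count distinct positions covered by inclusive ranges."""
    covered = set()
    for start, end in ranges:
        covered.update(range(start, end + 1))
    return len(covered)


smr = "433-590, 980-1027, 1188-1269, 1593-1653, 1853-1914, 2063-2122, 2135-2187"
ranges = parse_ranges(smr)
n = covered_residues(ranges)
print(n, round(100 * n / 2572, 1))  # 524 20.4
```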

Protein-protein interaction databases

BioGrid: 124533. 1 interaction.

PTM databases

PhosphoSite: Q9C0A1.

Polymorphism databases

DMDM: 300669698.

Proteomic databases

PaxDb: Q9C0A1.
PRIDE: Q9C0A1.

Genome annotation databases

Ensembl: ENST00000419474; ENSP00000413418; ENSG00000136367. [Q9C0A1-1]
GeneID: 85446.
KEGG: hsa:85446.
UCSC: uc010akq.3. human. [Q9C0A1-1]

Organism-specific databases

CTD: 85446.
GeneCards: GC14M023990.
H-InvDB: HIX0011544.
    HIX0172362.
HGNC: HGNC:20152. ZFHX2.
HPA: HPA000720.
    HPA005146.
neXtProt: NX_Q9C0A1.

Phylogenomic databases

eggNOG: NOG272646.
HOGENOM: HOG000128272.
HOVERGEN: HBG063719.
InParanoid: Q9UPU6.
KO: K09379.
OMA: PLCQEQL.
OrthoDB: EOG7FR7FH.
PhylomeDB: Q9C0A1.
TreeFam: TF323288.

Gene expression databases

ArrayExpress: Q9C0A1.
Bgee: Q9C0A1.
CleanEx: HS_ZFHX2.
Genevestigator: Q9C0A1.

Family and domain databases

Gene3D: 1.10.10.60. 3 hits.
    3.30.160.60. 2 hits.
InterPro: IPR017970. Homeobox_CS.
    IPR001356. Homeobox_dom.
    IPR009057. Homeodomain-like.
    IPR027028. ZFHX2.
    IPR007087. Znf_C2H2.
    IPR015880. Znf_C2H2-like.
    IPR013087. Znf_C2H2/integrase_DNA-bd.
    IPR003604. Znf_U1.
PANTHER: PTHR24208:SF42. PTHR24208:SF42. 1 hit.
Pfam: PF00046. Homeobox. 3 hits.
SMART: SM00389. HOX. 3 hits.
    SM00355. ZnF_C2H2. 15 hits.
    SM00451. ZnF_U1. 7 hits.
SUPFAM: SSF46689. SSF46689. 3 hits.
PROSITE: PS00027. HOMEOBOX_1. 1 hit.
    PS50071. HOMEOBOX_2. 3 hits.
    PS00028. ZINC_FINGER_C2H2_1. 9 hits.
    PS50157. ZINC_FINGER_C2H2_2. 5 hits.

Other

GenomeRNAi: 85446.
NextBio: 76049.
PRO: Q9C0A1.

Entry information

Entry name: ZFHX2_HUMAN
Accession: Primary (citable) accession number: Q9C0A1
Secondary accession number(s): Q9UPU6
Entry history
Integrated into UniProtKB/Swiss-Prot: February 12, 2003
Last sequence update: July 13, 2010
Last modified: April 16, 2014
This is version 110 of the entry and version 3 of the sequence.
Entry status: Reviewed (UniProtKB/Swiss-Prot)
Annotation program: Chordata Protein Annotation Program
Disclaimer: Any medical or genetic information present in this entry is provided for research, educational and informational purposes only. It is not in any way intended to be used as a substitute for professional medical advice, diagnosis, treatment or care.

Relevant documents

SIMILARITY comments: Index of protein domains and families

Human chromosome 14: entries, gene names and cross-references to MIM