| 1 | Vaiman D, Pietrokovski S, Cohen B, Benech P, Chebath J |
| Synergism of type I and type II interferons in stimulating the activity of the same DNA enhancer |
| FEBS Lett 265:12-16, 1990
(Medline ID:
90306300)
|
| Abstract |
| Type I and type II interferons (IFNs) can act synergistically to activate the transcription of the 2-5A synthetase gene. We used in vivo functional assays of sequences from the gene promoter region to determine which DNA segment mediates the gene induction by IFN gamma and the synergistic effect. We found that the type I IFN-inducible enhancer (or IRS) of the 2-5A synthetase gene also confers inducibility by type II IFN to a reporter CAT gene, though the time course and dose response of the induction by the two IFNs are quite different. A clear synergism of the two IFNs in stimulating the IRS is observed at low doses of the two IFNs. |
| 2 | Pietrokovski S, Hirshon J, Trifonov EN |
| Linguistic measure of taxonomic and functional relatedness of nucleotide sequences |
| J Biomol Struct Dyn 7:1251-1268, 1990
(Medline ID:
2363847)
|
| Abstract |
| The frequencies of "words", oligonucleotides within nucleotide sequences, reflect the genetic information contained in the sequence "texts". Nucleotide sequences are characteristically represented by their contrast word vocabularies. Comparison of the sequences by correlating their contrast vocabularies is shown to reflect well the relatedness (unrelatedness) between the sequences. A single value, the linguistic similarity between the sequences, is suggested as a measure of sequence relatedness. Sequences as short as 1000 bases can be characterized and quantitatively related to other sequences by this technique. The linguistic sequence similarity value is used for analysis of taxonomically and functionally diverse nucleotide sequences. The similarity value is shown to be very sensitive to the relatedness of the source species, thus providing a convenient tool for taxonomic classification of species by their sequence vocabularies. Functionally diverse sequences appear distinct by their linguistic similarity values. This can be a basis for a quick screening technique for functional characterization of the sequences and for mapping functionally distinct regions in long sequences. |
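The idea of comparing sequences through their oligonucleotide "word" frequencies can be illustrated with a minimal sketch. This is not the contrast-vocabulary measure of the paper, whose details the abstract does not give; it simply counts overlapping k-letter words and correlates the two frequency vectors. The function names (`word_frequencies`, `pearson`, `vocabulary_correlation`) and the word length `k` are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt


def word_frequencies(seq, k=3):
    """Relative frequencies of all 4**k overlapping k-letter words (A, C, G, T only)."""
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Words containing other symbols (e.g. N) are ignored.
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no valid words of length k")
    return [counts[w] / total for w in words]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def vocabulary_correlation(seq_a, seq_b, k=3):
    """Hypothetical stand-in for a single similarity value between two sequences."""
    return pearson(word_frequencies(seq_a, k), word_frequencies(seq_b, k))
```

A sequence compared with itself gives a correlation of 1, while sequences that share no words give a negative value; real use would need sequences long enough for the word counts to be meaningful.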
| 3 | Pietrokovski S, Trifonov EN |
| Imported sequences in the mitochondrial yeast genome identified by nucleotide linguistics |
| Gene 122:129-137, 1992
(Medline ID:
1452019)
|
| Abstract |
| In addition to universally appearing mitochondrial (mt) genes, origins of replication and transcription start regions typical of all mt genome variants of the yeast Saccharomyces cerevisiae, the mt genomes of some of the strains contain variable sequences. These sequences are apparently largely dispensable. They are mainly composed of group-I and -II introns and intergenic open reading frames (ORFs). Many of the introns contain ORFs, some of which were shown by genetic and biochemical means to be involved in splicing and transposition of the mt introns. Some of the optional sequences are hypothesized to be mobile genetic elements. Nucleotide (nt) sequences of the mt genome of S. cerevisiae were examined by analyzing occurrences of oligodeoxyribonucleotide (oligo) 'words'. This linguistic technique had been found to be sensitive to both function and origin of the sequence [Pietrokovski et al., J. Biomol. Struct. Dyn. 7 (1990) 1251-1268]. A clear difference is found between the oligo vocabularies of the optional and basic yeast mt sequences. The difference is mainly located in protein coding segments of the optional sequences which contain conserved amino acid motifs, characteristic of intronic and intergenic ORFs. The use of nt linguistics to detect the sequence dissimilarity and its causes in yeast mitochondria provides fast and straightforward results, identifying the intronic and intergenic ORFs as DNA sequences of foreign, non-mt origin. |
| 6 | Henikoff S, Henikoff JG, Alford WJ, Pietrokovski S |
| Automated construction and graphical presentation of protein blocks from unaligned sequences |
|
Gene 163:GC17-GC26, 1995
(Medline ID:
7590261)
|
|
| Abstract |
| Protein blocks consist of multiply aligned sequence segments that correspond to the most highly conserved regions of protein families. Typically, a set of related proteins has more than one region in common and their relationship can be represented as a series of ungapped blocks separated by unaligned regions. Blockmaker is an automated system available by electronic mail (blockmaker@howard.fhcrc.org) and the World Wide Web (http://www.blocks.fhcrc.org) that finds blocks in a group of related protein sequences submitted by the user. It adapts and extends existing algorithms to make them useful to biologists looking for conserved regions in a group of related protein sequences. Two sets of blocks are returned, one in which candidate blocks are detected using the MOTIF algorithm and the other using a Gibbs sampler algorithm that has been adapted for full automation. This use of two block-finding methods based on completely different principles provides a 'reality check,' whereby a block detected by both methods is considered to be correct. Resulting blocks can be displayed using the information-based 'sequence logo' method, adapted to incorporate sequence weights, which provides an intuitive visual description of both the residue and the conservation information at each position. Blocks generated by this system are useful in diverse applications, such as searching databases and designing degenerate PCR primers. As an example, blocks made from amino acid sequences related to Caenorhabditis elegans Tc1 transposase were used to search GenBank, revealing that several fish and amphibian genomic sequences harbor previously unreported Tc1 homologs. |
| 13 |
Smith DR, Doucette-Stamm LA, Deloughery C, Lee H, Dubois J,
Aldredge T, Bashirzadeh R, Blakely D, Cook R, Gilbert K,
Harrison D, Hoang L, Keagle P, Lumm W, Pothier B, Qiu D,
Spadafora R, Vicaire R, Wang Y, Wierzbowski J, Gibson R,
Jiwani N, Caruso A, Bush D, Safer H, Patwell D, Prabhakar S,
McDougall S, Shimer G, Goyal A, Pietrokovski S, Church GM,
Daniels CJ, Mao J, Rice P, Nolling J, Reeve JN |
| Complete genome sequence of Methanobacterium thermoautotrophicum
strain deltaH: Functional analysis and comparative genomics |
|
J Bacteriology 179:7135-7155, 1997
(Medline ID:
9371463) |
| Abstract |
|
The complete 1,751,377 bp sequence of the genome of the
thermophilic archaeon Methanobacterium thermoautotrophicum
strain deltaH has been determined by a whole genome shotgun
sequencing approach. 1,855 open reading frames (ORFs) have been
identified that appear likely to encode polypeptides, 807 (44%) of
which have been assigned putative functions based on their
similarities to database sequences with assigned functions. 547
(29%) of the ORF encoded amino acid sequences are related to
database sequences with unknown functions, and 501 (27%) have
little or no homology to database sequences. Comparisons with
eucaryal, bacterial and archaeal specific databases reveal that
1,013 of the putative ORF-encoded gene products (54% of the total)
have sequences most similar to polypeptide sequences described
previously in other Archaea, and 210 (11%) have sequences with
significant similarity only to archaeal polypeptides. Comparisons
with the Methanococcus jannaschii genome data underline the
extensive divergence that has occurred between the two
methanogens. Only 352 (19%) of M. thermoautotrophicum ORFs encode
sequences that are >50% identical to M. jannaschii ORF-encoded
sequences, and only 14 (<1%) polypeptides are predicted to have
sequences that are >70% identical in the two methanogens. There is
also little conservation in the relative locations of orthologous
genes within the two methanogen genomes. When the M.
thermoautotrophicum ORF-encoded sequences are evaluated in terms
of their similarity to bacterial versus eucaryal polypeptide
sequences, 786 (42%) are more similar to bacterial sequences and
241 (13%) are more similar to eucaryal sequences. The majority of
gene products predicted to be proteins involved in cofactor and
small molecule biosyntheses, intermediary metabolism, transport,
nitrogen fixation, regulatory functions and interactions with the
environment have sequences more similar to bacterial than eucaryal
sequences, whereas many proteins predicted to be involved in DNA
metabolism, transcription, and translation have sequences more
similar to eucaryal than bacterial sequences. Most M.
thermoautotrophicum ORFs appear to be preceded by ribosome
binding sites and ORFs predicted to encode functionally related gene
products are frequently clustered in what appear to be multigene
transcriptional units. These include ORFs that encode polypeptides
with sequences related to proteins found in Eucarya but not in Bacteria.
The M. thermoautotrophicum genome is predicted to encode 24
polypeptides that could form two-component sensor kinase-response
regulator systems, homologs of the bacterial Hsp70-response
proteins DnaK and DnaJ, homologs of eucaryal DNA replication
initiation Cdc6 proteins, an X-family repair-type DNA polymerase
and an unusual archaeal B-type DNA polymerase formed by two
separate polypeptides encoded by genes that are ~0.65 Mb apart.
These are all molecular features notably absent in M. jannaschii. DNA
replication and genome organization in M. thermoautotrophicum
appear to have eucaryal features, based on the predicted presence of
two Cdc6 homologs and three histones, whereas the presence of an
ftsZ gene indicates a bacterial type of cell division initiation. DNA-
dependent RNA polymerase (RNAP) subunits A', A'', B', B'' and H
are encoded in a typical archaeal RNAP operon, and a second A'
subunit encoding gene, that contains frameshifts, is present at a
remote location. There are two rRNA operons, separated by only
~110 kb, and both contain a tRNAala (UGC) gene between the 16S and
23S rRNA genes. Immediately upstream of one rRNA operon is the 7S
RNA gene and a tRNAser (GCU) gene. There are 39 tRNA genes, ten in
two 5-gene clusters and 16 in 2-gene clusters. The remainder,
apart from the rRNA operon associated tRNA genes, are dispersed
apparently as single gene transcriptional units around the genome.
Introns are present between positions 37 and 38 of the mature
anticodon loop in the elongation tRNAmet, tRNAtrp and tRNApro
(GGG) genes, and the tRNApro (GGG) gene contains a second intron, at
an unprecedented location, between nucleotides 32 and 33 of the
mature anticodon loop. There is no selenocysteinyl-tRNA gene, nor
evidence for classically organized IS elements, prophages or
plasmids. The M. thermoautotrophicum genome contains one intein,
located in the alpha chain of ribonucleoside-diphosphate reductase,
and 2 extended repeats (3.6 kb and 8.6 kb in length) that are
members of a repeat family that has 18 representatives in the M.
jannaschii genome. |
| 14 | Henikoff S, Greene EA, Pietrokovski S, Bork P, Attwood TK, Hood L |
| Gene families: the taxonomy of protein paralogs and chimeras |
|
Science 278:609-614, 1997
(Medline ID:
9381171)
|
| Abstract |
|
Ancient duplications and rearrangements of protein-coding segments have
resulted in complex gene family relationships. Duplications can be tandem
or dispersed and can involve entire coding regions or modules that correspond
to folded protein domains. As a result, gene products may acquire new
specificities, altered recognition properties or modified functions. Extreme
proliferation of some families within an organism, perhaps at the expense of
other families, may correspond to functional innovations during evolution.
The underlying processes are still at work, and the large fraction of human
and other genomes consisting of transposable elements may be a manifestation
of the evolutionary benefits of genomic flexibility. |
| 15 | Greene EA, Pietrokovski S, Henikoff S, Bork P, Attwood TK, Hood L, Bairoch A |
| GENOME MAPS 8: Building gene families (wall chart) |
|
Science 278:615, 1997
(Medline ID:
9381172)
|
| Abstract |
|
Genome sequencing projects and other large-scale efforts are generating
hundreds of thousands of sequences of new proteins from diverse organisms.
The task of discovering the structure and function of an unknown protein is
aided by the fact that most new genes are related to other genes, and these
relationships can often be detected via sequence similarity. Perhaps half
of all known genes encode members of some 3000 major families. Family
members share sequence and structural similarities, suggesting divergence
from a common ancestor. Unlike proteins that are direct counterparts in
different organisms, there can be many members of a gene family within one
organism that carry out distinct, yet similar, functions. For the organism
itself, the existence of gene families provides a way of generating diversity
in function and specificity from a limited number of building blocks, which
is essential for the evolutionary success of a genome. Within large eukaryotic
genomes, gene family size varies tremendously, ranging from a unique member to
thousands of members. Even smaller genomes harbor families that comprise
several percent of their genome. |
| 19 | Rose TM, Schultz ER, Henikoff JG, Pietrokovski S, McCallum CM, Henikoff S |
|
Consensus-degenerate hybrid oligonucleotide primers
for amplification of distantly-related sequences
|
|
Nucleic Acids Research, 26:1628-1635 (1998)
(Medline ID:
9512532)
|
| Abstract |
|
We describe a new primer design strategy for PCR amplification of unknown
targets that are related to multiply-aligned protein sequences. Each
primer consists of a short 3' degenerate core region and a longer 5'
consensus clamp region. Only 3-4 highly conserved amino acid residues are
necessary for design of the core, which is stabilized by the clamp during
annealing to template molecules. During later rounds of amplification, the
non-degenerate clamp permits stable annealing to product molecules. We
demonstrate the practical utility of this hybrid primer method by detection
of diverse reverse transcriptase-like genes in a human genome, and by
detection of C5 DNA methyltransferase homologs in various plant DNAs. In
each case, amplified products were sufficiently pure to be cloned without
gel fractionation. This COnsensus-DEgenerate Hybrid Oligonucleotide Primer
(CODEHOP) strategy has been implemented as a computer program that is
accessible over the World-Wide Web (http://blocks.fhcrc.org/codehop.html)
and is directly linked from the BlockMaker
multiple sequence alignment site for hybrid primer prediction beginning with
a set of related protein sequences.
|
| 22 | Kowalski JC, Belfort M, Stapleton SA, Holpert M, Dansereau JT, Pietrokovski S, Baxter S, Derbyshire V |
| Configuration of the catalytic GIY-YIG domain of intron endonuclease I-TevI: coincidence of computational and molecular findings |
|
Nucleic Acids Research 27:2115-2125, 1999
(Medline ID:
10219084)
|
|
|
| Abstract |
|
I-TevI is a member of the GIY-YIG family of homing endonucleases. It is
folded into two structural and functional domains, an N-terminal catalytic
domain and a C-terminal DNA-binding domain, separated by a flexible linker.
In this study we have used genetic analyses, computational sequence analysis
and NMR spectroscopy to define the configuration of the N-terminal domain and
its relationship to the flexible linker. The catalytic domain is an alpha/beta
structure contained within the first 92 amino acids of the 245-amino acid
protein followed by an unstructured linker. Remarkably, this structured
domain corresponds precisely to the GIY-YIG module defined by sequence
comparisons of 57 proteins including more than 30 newly reported members
of the family. Although much of the unstructured linker is not essential
for activity, residues 93-116 are required, raising the possibility that
this region may adopt an alternate conformation upon DNA binding. Two
invariant residues of the GIY-YIG module, Arg27 and Glu75, located in
alpha-helices, have properties of catalytic residues. Furthermore, the
GIY-YIG sequence elements for which the module is named form part of
a three-stranded antiparallel beta-sheet that is important for I-TevI
structure and function. |
| 23 | Henikoff S, Henikoff JG, Pietrokovski S |
| Blocks+: A non-redundant database of protein alignment blocks derived from multiple compilations |
|
Bioinformatics 15:471-479, 1999
(Medline ID:
10383472)
|
|
|
| Abstract |
Motivation: As databanks grow, sequence classification and prediction of
function by searching protein family databases becomes increasingly valuable.
The original Blocks Database, which contains ungapped multiple alignments for
families documented in Prosite, can be searched to classify new sequences.
However, Prosite is incomplete, and families from other databases are now
available to expand coverage of the Blocks Database.
Results: To take advantage of protein family information present in several
existing compilations, we have used five databases to construct Blocks+, a
unified database that is built on the PROTOMAT/BLOSUM scoring model and that
can be searched using a single algorithm for consistent sequence classification.
The LAMA blocks-versus-blocks searching program identifies overlapping protein
families, making possible a non-redundant hierarchical compilation. Blocks+
consists of all blocks derived from PROSITE, blocks from Prints not present
in PROSITE, blocks from Pfam-A not present in PROSITE or Prints, and so on
for ProDom and Domo, for a total of 1995 protein families represented by 8909
blocks, doubling the coverage of the original Blocks Database. A challenge
for any procedure aimed at non-redundancy is to retain related but distinct
families while discarding those that are duplicates. We illustrate how using
multiple compilations can minimize this potential problem by examining the
SNF2 family of ATPases, which is detectably similar to distinct families of
helicases and ATPases.
Availability: http://blocks.fhcrc.org/
Contact: steveh@fhcrc.org. |
| 27 | Sapir T, Horesh D, Caspi M, Atlas R, Burgess HA, Grayer Wolf S, Francis F, Chelly J, Elbaum M, Pietrokovski S, Reiner O |
|
Doublecortin mutations cluster in evolutionarily conserved functional domains
|
|
Human Molecular Genetics 9:703-712, 2000
(Medline ID:
10749977)
|
|
|
| Abstract |
|
Mutations in the X-linked gene doublecortin (DCX) result in lissencephaly
in males or subcortical laminar heterotopia (`double cortex') in females. Various types
of mutation were identified and the sequence differences included nonsense, splice site
and missense mutations throughout the gene. Recently, we and others have demonstrated that
DCX interacts with and stabilizes microtubules. Here, we performed a detailed sequence analysis
of DCX and DCX-like proteins from various organisms and defined an evolutionarily conserved
Doublecortin (DC) domain. The domain typically appears in the N-terminus of proteins and
consists of two tandemly repeated 80 amino acid regions. In the large majority of patients,
missense mutations in DCX fall within the conserved regions. We hypothesized that
these repeats may be important for microtubule binding. We expressed DCX or DCLK (KIAA0369)
repeats in vitro and in vivo. Our results suggest that the first repeat binds
tubulin but not microtubules and enhances microtubule polymerization. To study the functional
consequences of DCX mutations, we overexpressed seven of the reported mutations in
COS7 cells and examined their effect on the microtubule cytoskeleton. The results demonstrate
that some of the mutations disrupt microtubules. The most severe effect was observed with
a tyrosine to histidine mutation at amino acid 125 (Y125H). Produced as a recombinant protein,
this mutation disrupts microtubules in vitro at high molar concentration. The positions
of the different mutations are discussed according to the evolutionarily defined DC-repeat
motif. The results from this study emphasize the importance of DCX-microtubule interaction
during normal and abnormal brain development. |
| 28 | Henikoff JG, Pietrokovski S, McCallum CM, Henikoff S |
|
Blocks-based methods for detecting protein homology
|
|
Electrophoresis 21:1700-1706, 2000
(Medline ID:
10870957)
|
|
|
| Abstract |
|
The most highly conserved regions of proteins can be represented as blocks of aligned
sequence segments, typically with multiple blocks for a given protein family. The Blocks
Database World Wide Web (http://blocks.fhcrc.org) and e-mail (blocks@blocks.fhcrc.org)
servers provide tools to search DNA and protein queries against the Blocks+ Database of
multiple alignments. We describe features for detection of distant relationships using
blocks. Blocks+ includes protein families from the PROSITE, Prints, Pfam-A, ProDom and
Domo databases. Other features include searching Blocks+ with the BLIMPS and NCBI's
IMPALA programs, sequence logos, phylogenetic trees, three-dimensional display of blocks
on PDB structures, and a polymerase chain reaction (PCR) primer design strategy based on blocks.
|
| 29 | Pietrokovski S, Shilo B-Z |
|
Identification of new signaling components in the Drosophila genome sequence
|
|
Functional & Integrative Genomics 1:250-255, 2001 (published online: 14 September 2000)
(Pubmed ID:
11793244)
|
|
|
| Abstract |
|
The availability of the complete sequence of the Drosophila genome, and the assignment of
putative reading frames, provides an opportunity to search for new members in families of
proteins generating signaling cascades. The six major pathways that dictate patterning
were examined: receptor tyrosine kinases, TGFβ, Wnt, Toll, Hedgehog and Notch. Several new
components were identified for the first four pathways, including ligands, receptors,
cytoplasmic components and transcription factors. Most notable is the identification of
a vascular endothelial growth factor (VEGF) receptor tyrosine kinase, two insulin/IGF I
receptors without cytoplasmic protein kinase domains, and a family of proteins similar to
Rhomboid (a protein involved in cleavage of TGFα-like ligands). A new TGFβ family ligand,
two new Wnts and a Frizzled receptor were also identified. Finally, for the Toll pathway,
two new potential Spatzle-like ligands and two new receptors were identified.
The number of new components is limited, and in the case of the Hedgehog and Notch pathways
no new members were identified. This indicates that for the signaling pathways which determine
pattern formation, the exhaustive genetic screens have identified most of the components.
Thus, functional redundancy between signaling components belonging to the same family is
limited, as mutations in each member usually give rise to a detectable phenotype.
|
| 30 | Kunin V, Chan B, Sitbon E, Lithwick G, Pietrokovski S |
|
Consistency analysis of similarity between multiple alignments -
prediction of protein function and fold structure from analysis of local sequence motifs
|
|
J Molecular Biology 307:939-949, 2001
(Pubmed ID:
11273712)
|
|
|
| Abstract |
|
A new method to analyze the similarity between multiply-aligned protein motifs (blocks) was
developed. It identifies sets of consistently aligned blocks. These are found to be protein
regions of similar function and structure that appear in different contexts. For example, the
Rossmann fold ligand-binding region is found similar to TIM barrel and methylase regions,
various protein families are predicted to have a TIM-barrel fold and the structural relation
between the ClpP protease and crotonase folds is identified from their sequence. Besides
identifying local structure features, sequence similarity across short sequence-regions
(less than twenty amino acids) also predicts structure similarity of whole domains (folds)
a few hundred amino acids long. Most of these relations could not be identified by other
advanced sequence-to-sequence and sequence-to-multiple alignments comparisons. We describe
the method (termed cyrca), present examples of our findings and discuss their implications.
|
| 32 | Adato A, Vreugde S, Joensuu T, Avidan N, Hamalainen R, Belenkiy O, Olender T, Bonne-Tamir B, Ben-Asher E, Espinos C, Mill Lehesjoki A, Flannery JG, Avraham KB, Pietrokovski S, Sankila E, Beckmann JS, Lancet D |
|
USH3A transcripts encode clarin-1, a four-transmembrane-domain protein with a possible role in sensory synapses
|
|
European Journal of Human Genetics 10:339-350, 2002
(Pubmed ID:
12080385)
|
|
|
| Abstract |
|
Usher syndrome type 3 (USH3) is an autosomal recessive disorder characterized by the association
of post-lingual progressive hearing loss, progressive visual loss due to retinitis pigmentosa and
variable presence of vestibular dysfunction. Because the previously defined transcripts do not
account for all USH3 cases, we performed further analysis and revealed the presence of additional
exons embedded in longer human and mouse USH3A transcripts and three novel USH3A mutations.
Expression of Ush3a transcripts was localized by whole mount in situ hybridization to cochlear
hair cells and spiral ganglion cells. The full length USH3A transcript encodes clarin-1, a
four-transmembrane-domain protein, which defines a novel vertebrate-specific family of three
paralogues. Limited sequence homology to stargazin, a cerebellar synapse four-transmembrane-domain
protein, suggests a role for clarin-1 in hair cell and photoreceptor cell synapses, as well as a
common pathophysiological pathway for different Usher syndromes.
|
| 33 | Henikoff JG, Greene EA, Taylor N, Pietrokovski S, Henikoff S |
|
Using the Blocks Database to Recognize Functional Domains
|
|
Current Protocols in Bioinformatics UNIT 2.2, 2002
|
|
|
| Abstract |
|
Blocks are ungapped multiple alignments of related protein sequence segments that correspond to the most conserved regions of the proteins. The Blocks Database is a collection of blocks representing known protein families that can be used to compare a protein or DNA sequence with documented families of proteins. Protocols in this unit describe the analysis of proteins and families using Blocks-based tools, including searching, exploring relationships with trees, making new blocks, and designing PCR primers from blocks for isolating homologous sequences.
|
| 37 | Amitai G, Dassa B, Pietrokovski S |
|
Protein-splicing of inteins with atypical glutamine and aspartate C-terminal residues
|
|
J Biological Chemistry 279:3121-3131, 2004 (published online: October 30, 2003)
(Pubmed ID:
14593103)
|
|
|
| Abstract |
|
Inteins are protein-splicing domains present in many proteins. They self catalyze their
excision from the host protein, ligating their former flanks by a peptide bond. The
C-terminal residue of inteins is typically an asparagine (Asn). Cyclization of this
residue to succinimide causes the final detachment of inteins from their hosts. We
studied protein-splicing activity of two inteins with atypical C-terminal residues.
One having a C-terminal glutamine (Gln), isolated from Chilo-Iridescent virus (CIV),
and another unique intein, first reported here, with a C-terminal aspartate, isolated
from Carboxydothermus hydrogenoformans (Chy). Protein-splicing activity was examined in
the wild-type inteins and in several mutants with N- and C-terminal amino acid substitutions.
We demonstrate that both wild-type inteins can protein-splice, probably by new variations of
the typical protein-splicing mechanism. Substituting the atypical C-terminal residue to the
typical Asn retained protein-splicing only in the CIV intein. All diverse C-terminal
substitutions in the Chy intein (Asp345 to Asn, Gln, Glu, and Ala) abolished protein-splicing
and generated N- and C-terminal cleavage. The observed C-terminal cleavage in the
Chy intein ending with Ala cannot be explained by cyclization of this residue.
We present and discuss several new models for reactions in the protein-splicing pathway.
|
| 38 | Dassa B, Haviv H, Amitai G, Pietrokovski S |
|
Protein splicing and auto-cleavage of bacterial intein-like domains lacking a C'-flanking nucleophilic residue
|
|
J Biological Chemistry 279:32001-32007, 2004 (published online: May 18, 2004)
(Pubmed ID:
15150275)
|
|
|
| Abstract |
|
Bacterial intein-like (BIL) domains are newly identified homologs of intein
protein-splicing domains. The two known types of BIL domains together with
inteins and hedgehog (Hog) auto-processing domains form the HINT superfamily.
BIL domains are distinct from inteins and Hogs in sequence,
phylogenetic distribution and host protein type, but little is known about
their biochemical activity. Here we experimentally study the auto-
processing activity of four BIL domains. An A-type BIL domain from
Clostridium thermocellum showed both protein-splicing and auto-cleavage
activities. The splicing is notable since this domain has a native Ala C'-
flanking residue, rather than a nucleophilic residue, which is absolutely
necessary for intein protein-splicing. B-type BIL domains from Rhodobacter
sphaeroides and Rhodobacter capsulatus cleaved their N' or C' ends. We
propose an alternative protein-splicing mechanism for A-type BIL domains.
After an initial N-S acyl shift, creating a thioester bond at the domain N'
end, the domain's C' end is cleaved by Asn cyclization. Next, the resulting
amino end of the C' flank attacks the thioester bond at the domain's N'
end. This aminolysis step splices the two flanks of the domain. The cleavage
activity of B-type BIL domains is explained in the context of the canonical intein
protein-splicing mechanism. Our results suggest that the different HINT
domains have related biochemical activities of proteolytic cleavages,
ligation and splicing. Yet the predominant reactions diverged in each HINT
type, according to their specific biological roles. We suggest that BIL
domain cleavage and splicing reactions are mechanisms for
post-translationally generating protein variability, particularly in
extracellular bacterial proteins.
|
| 40 | Amitai G, Shemesh A, Sitbon E, Shklyar M, Netanely D, Venger I, Pietrokovski S |
|
Network analysis of protein structures identifies functional residues
|
|
J Molecular Biology 344:1136-1145, 2004
(published online: November 6, 2004)
(Pubmed ID:
15544817)
|
|
|
| Abstract |
|
Identifying active site residues strictly from protein three-dimensional structure is a difficult task,
especially for proteins that have few or no homologues. We transformed protein structures into residue
interaction graphs (RIGs), where amino acid residues are graph nodes and their interactions with each
other are the graph edges. We found that active site, ligand-binding and evolutionarily conserved residues
typically have high closeness values. Residues with high closeness values interact directly or by a few
intermediates with all other residues of the protein. Combining closeness and surface accessibility
identified active site residues in 70% of 178 representative structures. Detailed structural analysis of
specific enzymes also located other types of functional residues. These include the substrate binding
sites of acetylcholinesterases and subtilisin, and the regions whose structural changes activate MAP
kinase and glycogen phosphorylase. Our approach uses single protein structures, and does not rely on
sequence conservation, comparison to other similar structures or any prior knowledge. Residue closeness
is distinct from various sequence and structure measures and can thus complement them in identifying
key protein residues. Closeness integrates the effect of the entire protein on single residues. Such
natural structural design may be evolutionarily maintained to preserve interaction redundancy and
contribute to optimal setting of functional sites.
|
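The closeness measure mentioned in this abstract can be sketched on a toy graph. The code below computes standard closeness centrality (number of other reachable nodes divided by the sum of shortest-path distances to them) with a breadth-first search. How the paper derives edges from a structure, and how it combines closeness with surface accessibility, is not stated in the abstract and is not modeled here; the function name `closeness` and the five-residue example graph are hypothetical.

```python
from collections import deque


def closeness(graph):
    """Closeness centrality for each node of an undirected, unweighted graph.

    graph maps each node to the set of its neighbours. Only nodes reachable
    from the source are counted, so the values are most meaningful when the
    graph is connected.
    """
    scores = {}
    for source in graph:
        # Breadth-first search gives shortest path lengths from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


# Hypothetical residue interaction graph: a simple chain A-B-C-D-E.
toy_rig = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
toy_scores = closeness(toy_rig)
```

In this chain the middle node C has the highest closeness, because it reaches every other node in at most two steps, which mirrors the abstract's description of high-closeness residues interacting directly or through a few intermediates with the rest of the protein.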
| 42 | Grinberg M, Schwarz M, Zaltsman Y, Eini T, Pietrokovski S, Gross A |
|
Mitochondrial carrier homolog 2 is a target of tBID in cells signaled to die by TNFα
|
|
Molecular and Cellular Biology 25:4579-4590 2005
(Pubmed ID:
15899861)
|
|
|
| Abstract |
|
BID, a proapoptotic BCL-2 family member, plays an essential role in the tumor necrosis factor alpha
(TNFα)/Fas death receptor pathway in vivo. Activation of the TNF-R1 receptor results in the cleavage of BID
into truncated BID (tBID), which translocates to the mitochondria and induces the activation of BAX or BAK.
In TNFα-activated FL5.12 cells, tBID becomes part of a 45-kDa cross-linkable mitochondrial complex. Here
we describe the biochemical purification of this complex and the identification of mitochondrial carrier
homolog 2 (Mtch2) as part of this complex. Mtch2 is a conserved protein that is similar to members of the
mitochondrial carrier protein family. Our studies with mouse liver mitochondria indicate that Mtch2 is an
integral membrane protein exposed on the surface of mitochondria. Using blue-native gel electrophoresis we
revealed that in viable FL5.12 cells Mtch2 resides in a protein complex of ca. 185 kDa and that the addition of
TNFα to these cells leads to the recruitment of tBID and BAX to this complex. Importantly, this recruitment
was partially inhibited in FL5.12 cells stably expressing BCL-XL. These results implicate Mtch2 as a
mitochondrial target of tBID and raise the possibility that the Mtch2-resident complex participates in the
mitochondrial apoptotic program.
|
| 44 | Nagasaki K, Shirai Y, Tomaru Y, Nishida K, Pietrokovski S |
|
Algal viruses with distinct intraspecies host specificities include identical intein elements
|
|
Applied and Environmental Microbiology 71:3599-3607 2005
(Pubmed ID:
16000767)
|
|
|
| Abstract |
|
HaV is a large double-stranded DNA virus infecting the single-cell bloom-forming
raphidophyte (golden brown alga) Heterosigma akashiwo. Molecular phylogenetic sequence
analysis of HaV DNA polymerase showed that it forms a sister group with Phycodnaviridae algal
viruses. All ten examined HaV strains, with distinct intraspecies host specificities, included in
their DNA polymerase genes an intein (protein intron). The 232-amino-acid inteins differed from
each other by no more than a single nucleotide change. All inteins were present in the same
conserved position, coding for an active-site motif, which also includes inteins in Mimivirus (a
very large double-stranded DNA virus of amoebae), and several archaeal DNA polymerases. The
HaV intein is closely related to the Mimivirus intein and both are apparently monophyletic to the
archaeal inteins. These observations suggest horizontal transfers of inteins between viruses of
different families and between archaea and viruses, and that viruses might be reservoirs and
intermediates in intein horizontal transmissions. The homing endonuclease domain of the HaV
intein alleles is mostly deleted. The mechanism keeping their sequences basically identical in
HaV strains specific for different hosts is yet unknown. One possibility is that rapid and local
changes in the HaV genome change its host specificity. This is the first report of inteins found in
viruses infecting eukaryotic algae.
|
| 46 | Slavikova S, Shy G, Yao Y, Glozman R, Levanony H, Pietrokovski S, Elazar Z, Galili G |
|
The autophagy-associated Atg8 gene family operates both under favourable growth conditions and under starvation stresses in Arabidopsis plants
|
|
J Experimental Botany 56:2839-2849 2005
(Pubmed ID:
16157655)
|
|
|
| Abstract |
|
Arabidopsis plants possess a family of nine AtAtg8 gene homologues of the yeast
autophagy-associated Apg8/Aut7 gene. To gain insight into how these genes function
in plants, first, the expression patterns of five AtAtg8 homologues were analysed
in young Arabidopsis plants grown under favourable growth conditions or following
exposure to prolonged darkness or sugar starvation. Promoters, plus the entire
coding regions (exons and introns) of the AtAtg8 genes, were fused to the
β-glucuronidase reporter gene and transformed into Arabidopsis plants.
In all plants, grown under favourable growth conditions, β-glucuronidase
staining was much more significant in roots than in shoots. Different genes
showed distinct spatial and temporal expression patterns in roots. In some
transgenic plants, β-glucuronidase staining in leaves was induced by
prolonged darkness or sugar starvation. Next, Arabidopsis plants were
transformed with a chimeric gene encoding Atg8f protein fused to N-terminal
green fluorescent protein and C-terminal haemagglutinin epitope tags.
Analysis of these plants showed that, under favourable growth conditions,
the Atg8f protein is efficiently processed and is localized to
autophagosome-resembling structures, both in the cytosol and in the
central vacuole, in a similar manner to its processing and localization
under starvation stresses. Moreover, treatment with a cocktail of proteasome
inhibitors did not prevent the turnover of this protein, implying that its
turnover takes place in the vacuoles, as occurs in yeasts. The results
suggest that, in plants, the cellular processes involving the Atg8 genes
function efficiently in young, non-senescing tissues, both under favourable
growth conditions and under starvation stresses.
|
| 47 | Bakhrat A, Baranes K, Krichevsky O, Rom I, Schlenstedt G, Pietrokovski S, Raveh D |
|
Nuclear import of Ho endonuclease utilizes two NLS signals and four importins of the ribosomal import system
|
|
J Biological Chemistry 281:12218-12226 2006
(published online: February 28, 2006)
(Pubmed ID:
16507575)
|
|
|
| Abstract |
|
Activity of Ho, the yeast mating switch endonuclease, is restricted to a narrow
time window of the cell cycle. Ho is unstable and despite being a nuclear protein
is exported to the cytoplasm for proteasomal degradation. We report here the
molecular basis for the highly efficient nuclear import of Ho and the relation
between its short half-life and passage through the nucleus. The Ho nuclear import
machinery is functionally redundant, being based on two bipartite nuclear
localization signals (NLSs), recognized by four importins of the ribosomal
import system. Ho degradation is regulated by the DNA damage response and
Ho retained in the cytoplasm is stabilized, implying that Ho acquires its
crucial degradation signals in the nucleus. Ho arose by domestication of a
fungal VMA1 intein. A comparison of the primary sequences of Ho and
fungal VMA1 inteins shows that the Ho NLSs are highly conserved in
all Ho proteins, but are absent from VMA1 inteins. Thus adoption of a
highly efficient import strategy occurred very early in the evolution of Ho.
This may have been a crucial factor in establishment of homothallism in yeast,
and a key event in the rise of the Saccharomyces sensu stricto.
|
| 48 |
Citri A, Harari D, Shochat G, Ramakrishnan P, Gan J, Eisenstein M, Kimchi A, Wallach D, Pietrokovski S, Yarden Y |
|
Hsp90 recognizes a common surface on client kinases
|
|
J Biological Chemistry 281:14361-14369 2006
(published online: March 21, 2006)
(Pubmed ID:
16551624)
|
|
|
| Abstract |
|
Hsp90 is a highly abundant chaperone, whose clientele includes hundreds of
cellular proteins, many of which are central players in key signal
transduction pathways, and the majority of which are protein kinases. In light
of the variety of Hsp90 clientele, the mechanism of selectivity of the
chaperone towards its client proteins is a major open question. Focusing on
human kinases, we demonstrate that the chaperone recognizes a common surface
in the amino-terminal lobe of kinases from diverse families, including two
newly identified clients, NIK and DAPK, and the oncoprotein
HER2/ErbB-2. Surface electrostatics determine the interaction with the Hsp90
chaperone complex, such that introduction of a negative charge within this
region disrupts recognition. Compiling information on the Hsp90 dependence of
105 protein kinases, including 16 kinases whose relationship to Hsp90 is first
examined in this study, reveals that surface features, rather than a
contiguous amino-acid sequence, define the capacity of the Hsp90 chaperone
machine to recognize client kinases. Analyzing Hsp90 regulation of two major
signaling cascades, the MAP-kinase and PI-3 kinase, leads us to propose that
the selectivity of the chaperone to specific kinases is functional, namely:
Hsp90 controls kinases that function as hubs, integrating multiple
inputs. These lessons bear significance to pharmacological attempts to target
the chaperone in human pathologies, such as cancer.
|
| 49 |
Eyal E, Frenkel-Morgenstern M, Sobolev V, Pietrokovski S |
|
A pair-to-pair amino acids substitution matrix and its applications for protein structure prediction
|
|
Proteins 67:142-153 2007
(accepted August 2006, published online: January 22, 2007)
(Pubmed ID:
17243158)
|
|
|
| Abstract |
|
We present a new structurally derived pair-to-pair substitution matrix
(P2PMAT). This matrix is constructed from a very large amount of integrated
high quality multiple sequence alignments (Blocks) and protein structures. It
evaluates the likelihoods of all 160,000 pair-to-pair substitutions. P2PMAT
matrix implicitly accounts for evolutionary conservation, correlated
mutations, and residue-residue contact potentials. The usefulness of the
matrix for structural predictions is shown in this article. Predicting protein
residue-residue contacts from sequence information alone, by our method
(P2PConPred) is particularly accurate in the protein cores, where it performs
better than other basic contact prediction methods (increasing accuracy by
25-60%). The method mean accuracy for protein cores is 24% for 59 diverse
families and 34% for a subset of proteins shorter than 100 residues. This is
above the level that was recently shown to be sufficient to significantly
improve ab initio protein structure prediction. We also demonstrate the
ability of our approach to identify native structures within large sets of
(300-2000) protein decoys. On the basis of evolutionary information alone our
method ranks the native structure in the top 0.3% of the decoys in 4/10 of the
sets, and in 8/10 of sets the native structure is ranked in the top 10% of the
decoys. The method can, thus, be used to assist filtering wrong models,
complementing traditional scoring functions.
|
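The figure of 160,000 pair-to-pair substitutions quoted in the abstract above is consistent with counting ordered pairs of the 20 standard amino acids (400 pairs) and every substitution from one pair to another (400 x 400). The abstract does not spell out this derivation, so the ordered-pair reading below is an assumption that happens to reproduce the stated number.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: residue pairs are ordered, so "AC" and "CA" count separately.
pairs = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
n_pairs = len(pairs)

# Every pair can be substituted by every pair (including itself).
n_substitutions = n_pairs ** 2
```

Under this reading, a full matrix would hold one score per (pair, pair) combination, which is why it is far larger than a conventional 20 x 20 substitution matrix.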
| 50 |
Dassa B, Amitai G, Caspi J, Schueler-Furman O, Pietrokovski S |
|
Trans protein splicing of cyanobacterial split inteins in endogenous and exogenous combinations
|
|
Biochemistry 46:322-330 2007
(published online: January 2, 2007)
(Pubmed ID:
17198403)
|
|
|
| Abstract |
|
Inteins are autocatalytic protein domains that post-translationally excise from
protein precursors and ligate their flanking regions with a peptide bond, in a
process called protein splicing. Intein-containing DNA polymerases of
cyanobacteria and nanoarchaea are naturally split into two separate genes at
their intein domain. Such naturally occurring split inteins rapidly
self-associate and reconstitute protein-splicing activity in trans. Here, we
analyze the in vitro protein-splicing activity of three naturally split inteins
from diverse cyanobacteria: Oscillatoria limnetica, Thermosynechococcus
vulcanus, and Nostoc sp. PCC7120. N- and C-terminal halves of these split
inteins were mixed in nine combinations, resulting in three endogenous
(wild-type) and six exogenous combinations. Protein splicing was detected in
all split-intein combinations, despite a 30-50% sequence variation between the
homologous proteins. Splicing activity proceeded under a variety of conditions,
including the presence of denaturants and reductants and high temperature, ionic
strength, and viscosity. Still, in a high concentration of salt (2 M) or urea
(6 M), specific combinations spliced significantly better than others.
Additionally, copper ions were found to inhibit trans splicing in a reversible
double-lock reaction. Our comparative analysis of naturally split inteins in
endogenous and exogenous combinations demonstrates the modularity of trans
protein-splicing elements and their robust activity. It suggests tight
interactions between split-intein halves and conditions for modifying the
specificity of intein parts. These results promote the biotechnological use of
split inteins for controlled assembly of protein fragments either in vivo or
in vitro and under moderate or extreme conditions.
|
| 52 |
Frenkel-Morgenstern M, Magid R, Eyal E, Pietrokovski S |
|
Refining intra-protein contact prediction by graph analysis
|
|
BMC Bioinformatics 8:S6 2007
(published online: May 24, 2007)
(Pubmed ID:
17570865)
|
|
|
| Abstract |
|
BACKGROUND: Accurate prediction of intra-protein residue contacts from sequence
information will allow the prediction of protein structures. Basic predictions
of such specific contacts can be further refined by jointly analyzing
predicted contacts, and by adding information on the relative positions of
contacts in the protein primary sequence.
RESULTS: We introduce a method for graph analysis refinement of intra-protein contacts,
termed GARP. Our previously presented intra-contact prediction method by means
of pair-to-pair substitution matrix (P2PConPred) was used to test the GARP
method. In our approach, the top contact predictions obtained by a basic
prediction method were used as edges to create a weighted graph. The edges
were scored by a mutual clustering coefficient that identifies highly
connected graph regions, and by the density of edges between the sequence
regions of the edge nodes. A test set of 57 proteins with known structures was
used to determine contacts. GARP improves the accuracy of the P2PConPred basic
prediction method in whole proteins from 12% to 18%.
CONCLUSION: Using a simple approach we increased the contact prediction accuracy of a
basic method by 1.5 times. Our graph approach is simple to implement, can be
used with various basic prediction methods, and can provide input for further
downstream analyses.
|
| 54 |
Shoval Y, Pietrokovski S, Kimchi A |
|
ZIPk: a unique case of murine-specific divergence of a conserved vertebrate gene
|
|
PLoS Genetics 3:e180 doi:10.1371/journal.pgen.0030180 2007
(published online: September 7, 2007)
(Pubmed ID:
17953487)
|
|
|
| Abstract |
|
Zipper interacting protein kinase (ZIPK, also known as death-associated
protein kinase 3 [DAPK3]) is a Ser/Thr kinase that functions in programmed
cell death. Since its identification eight years ago, contradictory findings
regarding its intracellular localization and molecular mode of action have
been reported, which may be attributed to unpredicted differences among the
human and rodent orthologs. By aligning the sequences of all available ZIPK
orthologs, from fish to human, we discovered that rat and mouse sequences are
more diverged from the human ortholog relative to other, more distant,
vertebrates. To test experimentally the outcome of this sequence divergence,
we compared rat ZIPK to human ZIPK in the same cellular settings. We found
that while ectopically expressed human ZIPK localized to the cytoplasm and
induced membrane blebbing, rat ZIPK localized exclusively within nuclei,
mainly to promyelocytic leukemia oncogenic bodies, and induced significantly
lower levels of membrane blebbing. Among the unique murine (rat and mouse)
sequence features, we found that a highly conserved phosphorylation site,
previously shown to have an effect on the cellular localization of human ZIPK,
is absent in murines but not in earlier diverging organisms. Recreating this
phosphorylation site in rat ZIPK led to a significant reduction in its
promyelocytic leukemia oncogenic body localization, yet did not confer full
cytoplasmic localization. Additionally, we found that while rat ZIPK interacts
with PAR-4 (also known as PAWR) very efficiently, human ZIPK fails to do so.
This interaction has clear functional implications, as coexpression of PAR-4
with rat ZIPK caused nuclear to cytoplasm translocation and induced strong
membrane blebbing, thus providing the murine protein a possible adaptive
mechanism to compensate for its sequence divergence. We have also cloned
zebrafish ZIPK and found that, like the human and unlike the murine orthologs,
it localizes to the cytoplasm, and fails to bind the highly conserved PAR-4
protein. This further supports the hypothesis that murine ZIPK underwent
specific divergence from a conserved consensus. In conclusion, we present
a case of species-specific divergence occurring in a specific branch of the
evolutionary tree, accompanied by the acquisition of a unique protein-protein
interaction that enables conservation of cellular function.
|
| 55 |
Ilouz R, Pietrokovski S, Eisenstein M, Eldar-Finkelman H |
|
New insights into the autoinhibition mechanism of glycogen synthase kinase-3β
|
|
J Molecular Biology 383:999-1007 (2008)
(published online: 9 September, 2008)
(Pubmed ID:
18793648)
|
|
|
| Abstract |
|
It has been suggested that phosphorylation at serine 9 near the N-terminus of
glycogen synthase kinase-3β (GSK-3β) mimics the prephosphorylation of
its substrate and, therefore, the N-terminus functions as a
pseudosubstrate. The molecular basis for the pseudosubstrate's binding to the
catalytic core and autoinhibition has not been fully defined. Here, we
combined biochemical and computational analyses to identify the potential
residues within the N-terminus and the catalytic core engaged in
autoinhibition of GSK-3β. Bioinformatic analysis found Arg4, Arg6, and Ser9
in the pseudosubstrate sequence to be extremely conserved through
evolution. Mutations at Arg4 and Arg6 to alanine enhanced GSK-3β kinase
activity and impaired its ability to autophosphorylate at Ser9. In addition,
and unlike wild-type GSK-3β, these mutants were unable to undergo
autoinhibition by phosphorylated Ser9. We further show that Gln89 and Asn95,
located within the catalytic core, interact with the pseudosubstrate. Mutation
at these sites prevented inhibition by phosphorylated Ser9. Furthermore, the
respective mutants were not inhibited by a phosphorylated pseudosubstrate
peptide inhibitor. Finally, computational docking of the pseudosubstrate into
the catalytic active site of the kinase suggested specific interactions
between Arg6 and Asn95 and of Arg4 to Asp181 (apart from the interaction of
phosphorylated serine 9 with the "phosphate binding pocket"). Altogether, our
study supports a model of GSK-3-pseudosubstrate autoregulation that involves
phosphorylated Ser9, Arg4, and Arg6 within the N-terminus and identified the
specific contact sites within the catalytic core.
|
| 56 |
Dori-Bachash M, Dassa B, Pietrokovski S, Jurkevitch E |
|
Proteome-based comparative analyses of growth stages reveal new cell-cycle dependent functions in the predatory bacterium Bdellovibrio bacteriovorus
|
|
Applied and Environmental Microbiology 74:7152-7162 (2008)
(published online: 3 October, 2008)
(Pubmed ID:
18836011)
|
|
|
| Abstract |
|
Bdellovibrio and like organisms are obligate predators of bacteria that are
ubiquitously found in the environment. Most exhibit a peculiar dimorphic
life cycle during which free-swimming attack phase (AP) cells search for and
invade bacterial prey cells. The invader develops in the prey as a filamentous
polynucleoid-containing cell that finally splits into progeny
cells. Therapeutic and biocontrol applications of Bdellovibrio in human and
animal, and plant health, respectively, have been proposed but more knowledge
on this peculiar cell cycle is needed to develop such applications. A
proteomic approach was applied to study cell cycle dependent expression of the
Bdellovibrio bacteriovorus' proteome in synchronous cultures of a facultative
host-independent (HI) strain able to grow in the absence of
prey. Two-dimensional gel electrophoresis, mass spectrometry and temporal
expression of selected genes in predicted operons were analyzed. In total,
about 21% of the in-silico predicted proteome was covered. One hundred and
ninety six proteins were identified, including 63 hitherto unknown proteins
and 140 life stage-dependent spots. Of those, 47 were differentially
expressed, including chemotaxis, attachment, growth and replication-related,
cell wall and regulatory proteins. Novel cell cycle-dependent adhesion,
gliding, mechanosensing, signaling and hydrolytic functions were assigned. The
HI model was further studied by comparing HI and wild-type AP-cells, revealing
that proteins involved in DNA replication and signaling were deregulated in
the former. A complementary analysis of the secreted proteome identified 59
polypeptides, including cell contact proteins and hydrolytic enzymes specific
to predatory bacteria.
|
| 57 |
Dori-Bachash M, Dassa B, Peleg O, Pineiro SA, Jurkevitch E, Pietrokovski S |
|
Bacterial intein-like domains of predatory bacteria: a new domain type characterized in Bdellovibrio bacteriovorus
|
|
Functional & Integrative Genomics 9:153-166 (2009)
(published online: 20 January, 2009)
(Pubmed ID:
19153786)
|
|
|
| Abstract |
|
We report a new family of bacterial intein-like domains (BILs) identified in
ten proteins of four diverse predatory bacteria. BILs belong to the HINT
(Hedgehog/Intein) superfamily of domains that post-translationally
self-process their protein molecules by protein splicing and
self-cleavage. The new, C-type, BILs appear with other domains, including
putative predator-specific domain 1 (PPS-1), a new domain typically appearing
immediately upstream of C-type BILs. The Bd2400 protein of the obligate
predator Bdellovibrio bacteriovorus includes a C-type BIL and a PPS-1 domain
at its C-terminal part, and a signal peptide and two polycystic kidney disease
domains at its N-terminal part. We demonstrate the in vivo transcription,
translation, secretion, and processing of the B. bacteriovorus protein, and
the in vitro autocatalytic N-terminal cleavage activity of its C-type
BIL. Interestingly, whereas the Bd2400 gene is constitutively expressed, its
protein product is differentially processed throughout the dimorphic life
cycle of the B. bacteriovorus predator. The modular structure of the protein,
its localization, and complex processing suggest that it may be involved in
the interaction between the predator and its prey.
|
| 58 |
Dassa B, London N, Stoddard BL, Schueler-Furman O, Pietrokovski S |
|
Fractured genes: a novel genomic arrangement involving new split inteins and a new homing endonuclease family
|
|
Nucleic Acids Research 37:2560-2573 (2009)
(published online: 5 March, 2009)
(Pubmed ID:
19264795)
|
|
|
| Abstract |
|
Inteins are genetic elements, inserted in-frame into protein-coding genes,
whose products catalyze their removal from the protein precursor via a
protein-splicing reaction. Intein domains can be split into two fragments and
still ligate their flanks by a trans-protein-splicing reaction. A
bioinformatic analysis of environmental metagenomic data revealed 26 different
loci with a novel genomic arrangement. In each locus, a conserved enzyme
coding region is broken in two by a split intein, with a free-standing
endonuclease gene inserted in between. Eight types of DNA synthesis and repair
enzymes have this 'fractured' organization. The new types of naturally
split-inteins were analyzed in comparison to known split-inteins. Some loci
include apparent gene control elements brought in with the endonuclease
gene. A newly predicted homing endonuclease family, related to very-short
patch repair (Vsr) endonucleases, was found in half of the loci. These
putative homing endonucleases also appear in group-I introns, and as
stand-alone inserts in the absence of surrounding intervening sequences. The
new fractured genes organization appears to be present mainly in phage, shows
how endonucleases can integrate into inteins, and may represent a missing link
in the evolution of gene breaking in general, and in the creation of
split-inteins in particular.
|
| 59 |
Eldar-Finkelman H, Licht-Murava A, Pietrokovski S, Eisenstein M |
|
Substrate competitive GSK-3 inhibitors - strategy and implications
|
|
Biochim Biophys Acta 1804:598-603 (2010)
(published online: 18 September, 2009)
(Pubmed ID:
19770076)
|
|
|
| Abstract |
|
Glycogen synthase kinase-3 (GSK-3) is a highly conserved protein
serine/threonine kinase ubiquitously distributed in eukaryotes as a
constitutively active enzyme. Abnormally high GSK-3 activity has been
implicated in several pathological disorders, including diabetes and
neurodegenerative and affective disorders. This led to the hypothesis that
inhibition of GSK-3 may have therapeutic benefit. Most GSK-3 inhibitors
developed so far compete with ATP and often show limited specificity. Our goal
is to develop inhibitors that compete with GSK-3 substrates, as this type of
inhibitor is more specific and may be useful for clinical applications. We
have employed computational, biochemical, and molecular analyses to gain
in-depth understanding of GSK-3's substrate recognition. Here we argue that
GSK-3 is a promising drug discovery target and describe the strategy and
practice for developing specific substrate-competitive inhibitors of GSK-3.
|
| 60 |
Salzberg Y, Eldar T, Karminsky O, Bar-Sheshet Itach S, Pietrokovski S, Don J |
|
Meig1 Deficiency Causes a Severe Defect in Mouse Spermatogenesis
|
Developmental Biology 338:158-167 (2010)
(published online 22 Nov. 2009)
(Pubmed ID:
20004656)
|
|
|
| Abstract |
|
Meig1 is a mouse gene, abundantly expressed in the testis. It encodes two
alternative transcripts that are expressed differentially in the somatic and
germinal compartments of the testis. These transcripts share the same coding
region but differ in their 5' untranslated regions, due to alternative
promoters. Here we show that MEIG1 is a highly conserved short metazoan
protein with a conserved core of 81 residues. It is present from chordates to
radial symmetry animals, with an intriguing absence in insects and
nematodes. It is also present in two earlier diverging protist lineages. To
elucidate the role of MEIG1 during gamete production we established a knockout
mouse line by eliminating the common coding region. Our results identified
Meig1 as a critical spermatogenic gene, whose absence results in complete male
infertility. Differentiation of spermatocytes to haploid spermatids seemed
complete, although with significantly increased apoptosis of germ cells, but
further differentiation into later stages was generally blocked. The caudal
epididymis was apparently missing spermatozoa, and the very few that were
obtained were immotile and exhibited a wide range of severe morphological
abnormalities. Meig1 is, therefore, a highly conserved gene which is
indispensable for sperm production and hence for male fertility in mice.
|
| 61 |
Tori K, Dassa B, Johnson MA, Southworth MW, Brace LE, Ishino Y, Pietrokovski S, Perler FB
|
|
Splicing of the Mycobacteriophage Bethlehem DnaB intein: identification of a new mechanistic class of inteins that contain an obligate Block F nucleophile
|
|
J Biological Chemistry 285:2515-2526 (2010)
(published online 23 Nov. 2009)
(Pubmed ID:
19940146)
|
|
|
| Abstract |
|
Inteins are single turnover enzymes that splice out of protein precursors
during maturation of the host protein (extein). The Cys or Ser at the
N-terminus of most inteins initiates a four-step protein splicing reaction by
forming a (thio)ester bond at the N-terminal splice junction. Several recently
identified inteins cannot perform this acyl rearrangement because they do not
begin with Cys, Thr or Ser. This study analyzes one of these, the
Mycobacteriophage Bethlehem DnaB intein, which we describe here as the
prototype for a new class of inteins based on sequence comparisons, reactivity
and mechanism. These Class 3 inteins are characterized by a non-nucleophilic
N-terminal residue that co-varies with a non-contiguous Trp, Cys, Thr triplet
(WCT) and a Thr or Ser as the first C-extein residue. Several mechanistic
differences were observed compared to standard inteins or previously studied
atypical KlbA Ala1 inteins: (a) cleavage at the N-terminal splice junction in
the absence of all standard N- and C-terminal splice junction nucleophiles,
(b) activation of the N-terminal splice junction by a variant Block B motif
that includes the WCT triplet Trp, (c) decay of the branched intermediate by
thiols or Cys despite an ester linkage at the C-extein branch point, and (d)
an absolute requirement for the WCT triplet Block F Cys. Based on biochemical
data and confirmed by molecular modeling, we propose roles for these newly
identified conserved residues, a novel protein splicing mechanism that
includes a second branched intermediate, and an intein classification with 3
mechanistic categories.
|
| 62 |
Harel A, Dalah I, Pietrokovski S, Safran M, Lancet D
|
|
Omics Data Management and annotation
|
|
In: Bioinformatics for Omics Data,
Series: Methods in Molecular Biology, Volume 719, edited by Mayer B.
Humana Press, New York, NY (2011) ISBN: 978-1-61779-026-3
(Pubmed ID:
21370079)
|
|
|
| Abstract |
|
Technological Omics breakthroughs, including next generation sequencing, bring
avalanches of data which need to undergo effective data management to ensure
integrity, security, and maximal knowledge-gleaning. Data management system
requirements include flexible input formats, diverse data entry mechanisms and
views, user friendliness, attention to standards, hardware and software
platform definition, as well as robustness. Relevant solutions elaborated by
the scientific community include Laboratory Information Management Systems
(LIMS) and standardization protocols facilitating data sharing and
managing. In project planning, special consideration has to be made when
choosing relevant Omics annotation sources, since many of them overlap and
require sophisticated integration heuristics. The data modeling step defines
and categorizes the data into objects (e.g. genes, articles, disorders) and
creates an application flow. A data storage/warehouse mechanism must be
selected such as file-based systems and relational databases, the latter
typically used for larger projects. Omics project life cycle considerations
must include the definition and deployment of new versions, incorporating
either full or partial updates. Finally, quality assurance procedures must
validate data and feature integrity, as well as system performance
expectations. These data management principles are illustrated with examples
from the life cycle of the GeneCards Omics project (www.genecards.org), a
comprehensive, widely used compendium of annotative information about human
genes. For example, the GeneCards infrastructure has recently been changed
from text files to a relational database, enabling better organization and views
of the growing data. Omics data handling benefits from the wealth of web-based
information, the vast amount of public domain software, increasingly
affordable hardware, and effective use of data management and annotation
principles as outlined in this chapter.
|
| 63 |
Wurtzel O, Dori-Bachash M, Pietrokovski S, Jurkevitch E, Sorek R
|
|
Mutation detection with next-generation resequencing through a mediator genome
|
|
PLoS One 5:e15628 (2010)
(published online 31 Dec. 2010)
(Pubmed ID:
21209874)
|
|
|
| Abstract |
|
The affordability of next generation sequencing (NGS) is transforming the
field of mutation analysis in bacteria. The genetic basis for phenotype
alteration can be identified directly by sequencing the entire genome of the
mutant and comparing it to the wild-type (WT) genome, thus identifying
acquired mutations. A major limitation for this approach is the need for an
a-priori sequenced reference genome for the WT organism, as the short reads of
most current NGS approaches usually prohibit de-novo genome assembly. To
overcome this limitation we propose a general framework that utilizes the
genomes of related organisms as mediators for comparing WT and mutant
bacteria. Under this framework, both mutant and WT genomes are sequenced with
NGS, and the short sequencing reads are mapped to the mediator
genome. Variations between the mutant and the mediator that recur in the WT
are ignored, thus pinpointing the differences between the mutant and the
WT. To validate this approach we sequenced the genome of Bdellovibrio
bacteriovorus 109J, an obligatory bacterial predator, and its prey-independent
mutant, and compared both to the mediator species Bdellovibrio bacteriovorus
HD100. Although the mutant and the mediator sequences differed in more than
28,000 nucleotide positions, our approach enabled pinpointing the single
causative mutation. Experimental validation in 53 additional mutants further
established the implicated gene. Our approach extends the applicability of
NGS-based mutant analyses beyond the domain of available reference genomes.
|
| 64 |
Azoulay-Alfaguter I, Yaffe Y, Licht-Murava A, Urbanska M, Jaworski J, Pietrokovski S, Hirschberg K, Eldar-Finkelman H
|
|
Distinct molecular regulation of GSK-3α isozyme controlled by its N-terminal region. Functional role in calcium/calpain signaling
|
|
J. Biological Chemistry 286:13470-13480 (2011)
(Published online January 25, 2011)
(Pubmed ID:
21266584)
|
|
|
| Abstract |
|
Glycogen synthase kinase-3 is expressed as two isozymes alpha and beta. They
share high similarity in their catalytic domains, but differ in their N- and
C-terminal regions, with GSK-3α having an extended glycine-rich
N-terminus. Here we undertook live cell imaging combined with molecular and
bioinformatics studies to understand the distinct functions of the GSK-3
isozymes focusing on GSK-3α-N-terminal region. We found that unlike
GSK-3β, which shuttles between the nucleus and cytoplasm, GSK-3α was
excluded from the nucleus. Deletion of the N-terminal region of GSK-3α
resulted in nuclear localization, and treatment with Leptomycin B led to
accumulation of GSK-3α in the nucleus. GSK-3α rapidly accumulated in
the nucleus in response to calcium or serum deprivation, and accumulation was
strongly inhibited by the calpain inhibitor calpeptin. This nuclear
accumulation was not mediated by cleavage of the N-terminal region or
phosphorylation of GSK-3α. Rather, we show that calcium-induced GSK-3α
nuclear accumulation was governed by GSK-3α-binding with an as yet unknown
calpain-sensitive protein or proteins; this binding was mediated by the
N-terminal region. Bioinformatic and experimental analyses indicated that
nuclear exclusion of GSK-3α was likely an exclusive characteristic of
mammalian GSK-3α. Finally, we show that nuclear localization of GSK-3α
reduced the nuclear pool of β-catenin and its target cyclin D1. Taken
together, these data suggest that the N-terminal region of GSK-3α is
responsible for its nuclear exclusion and that binding with a calcium/calpain
sensitive product enables GSK-3α nuclear retention. We further uncovered a
novel link between calcium and nuclear GSK-3α-mediated inhibition of the
canonical Wnt/β-catenin pathway.
|
| 65 |
Shoval Y, Berissi H, Kimchi A, Pietrokovski S
|
|
New modularity of DAP-kinases: alternative splicing of the DRP-1 gene produces a ZIPk-like isoform
|
|
PLoS ONE 6:e17344 (2011)
(published online 8 Mar. 2011)
(Pubmed ID:
21408167)
|
|
|
| Abstract |
|
DRP-1 and ZIPk are two members of the Death Associated Protein Ser/Thr Kinase
(DAP-kinase) family, which function in different settings of cell death
including autophagy. DAP kinases are very similar in their catalytic domains
but differ substantially in their extra-catalytic domains. This difference is
crucial for the significantly different modes of regulation and function among
DAP kinases. Here we report the identification of a novel alternatively
spliced kinase isoform of the DRP-1 gene, termed DRP-1β. The alternative
splicing event replaces the whole extra catalytic domain of DRP-1 with a
single coding exon that is closely related to the sequence of the extra
catalytic domain of ZIPk. As a consequence, DRP-1β lacks the calmodulin
regulatory domain of DRP-1, and instead contains a leucine-zipper-like motif
similar to the protein binding region of ZIPk. Several functional assays
proved that this new isoform retained the biochemical and cellular properties
which are common to DRP-1 and ZIPk, including myosin light chain
phosphorylation, and activation of membrane blebbing and autophagy. In
addition, DRP-1β also acquired binding to the ATF4 transcription factor, a
feature characteristic of ZIPk but not DRP-1. Thus, a splicing event of the
DRP-1 gene produces a ZIPk-like isoform. DRP-1β is highly conserved in evolution,
present in all known vertebrate DRP-1 loci. We detected the corresponding mRNA
and protein in embryonic mouse brains and in human embryonic stem cells thus
confirming the in vivo utilization of this isoform. The discovery of module
conservation within the DAPk family members illustrates a parsimonious way to
increase the functional complexity within protein families. It also provides
crucial data for modeling the expansion and evolution of DAP kinase proteins
within vertebrates, suggesting that DRP-1 and ZIPk most likely evolved from
their ancient ancestor gene DAPk by two gene duplication events that occurred
close to the emergence of vertebrates.
|
| 66 |
Bialik S, Pietrokovski S, Kimchi A
|
|
Myosin drives autophagy in a pathway linking Atg1 to Atg9
|
|
EMBO J. 30:629-630 (2011)
(Pubmed ID:
21326172)
|
|
|
| Abstract |
|
Autophagy is a cellular process in which specialized autodegradative vesicles,
the autophagosomes, are formed. Much progress has been made in understanding
the molecular mechanism controlling autophagy, particularly the role of the
Atg genes. In this issue, Tang et al identify a signalling pathway linking two
main regulators, the Atg1 kinase—essential for the induction of the
autophagosome—and the transmembrane protein Atg9, whose shuttling between the
Golgi and the forming autophagosome provides a source of membrane for the new
vesicle. This study provides the missing piece of the puzzle: Atg1
phosphorylates and activates a myosin light chain kinase, which in turn
activates myosin to drive transport of Atg9.
|
| 67 |
Tsaadon Alon L, Pietrokovski S, Barkan S, Avrahami L, Kaidanovich-Beilin O, Woodgett JR, Barnea A, Eldar-Finkelman H
|
|
Selective loss of glycogen synthase kinase-3α in birds reveals distinct roles for GSK-3 isozymes in tau phosphorylation
|
|
FEBS Lett. 585:1158-1162 (2011) (published online 16 Mar. 2011)
(Pubmed ID:
21419127)
|
|
|
| Abstract |
|
Mammalian glycogen synthase kinase-3 (GSK-3) is a critical regulator of neuronal signaling,
cognition and behavior. It exists as two isozymes GSK-3α and GSK-3β, but their distinct
biological functions are not fully known. Here, we examined the evolutionary significance of
each of these isozymes. Surprisingly, we found that unlike other vertebrates that harbor
both GSK-3 genes, the GSK-3α gene is missing in birds. GSK-3-mediated tau phosphorylation was
significantly lower in bird brains than in mouse brains, a phenomenon that was reproduced in GSK-3α
knockout mice. In bird embryos tau was strongly phosphorylated. Altogether, these results suggest that GSK-3
isozymes play distinct roles in tau phosphorylation. Birds are GSK-3α knockout organisms and may
serve as a novel model to study the distinct functions of GSK-3 isozymes.
|
| 68 |
Shpilka T, Weidberg H, Pietrokovski S, Elazar Z
|
|
Atg8: an autophagy-related ubiquitin-like protein family
|
|
Genome Biology 12:226 (2011) (published online 27 July 2011)
(Pubmed ID:
21867568)
|
|
|
| Abstract |
|
Autophagy-related (Atg) proteins are eukaryotic factors participating in
various stages of the autophagic process. Thus far 34 Atgs have been
identified in yeast, including the key autophagic protein Atg8. The Atg8 gene
family encodes ubiquitin-like proteins that share a similar structure
consisting of two amino-terminal α helices and a ubiquitin-like core. Atg8
family members are expressed in various tissues, where they participate in
multiple cellular processes, such as intracellular membrane trafficking and
autophagy. Their role in autophagy has been intensively studied. Atg8 proteins
undergo a unique ubiquitin-like conjugation to phosphatidylethanolamine on the
autophagic membrane, a process essential for autophagosome formation. Whereas
yeast has a single Atg8 gene, many other eukaryotes contain multiple Atg8
orthologs. Atg8 genes of multicellular animals can be divided, by sequence
similarities, into three subfamilies: microtubule-associated protein 1 light
chain 3 (MAP1LC3 or LC3), γ-aminobutyric acid receptor-associated protein
(GABARAP) and Golgi-associated ATPase enhancer of 16 kDa (GATE-16), which are
present in sponges, cnidarians (such as sea anemones, corals and hydras) and
bilateral animals. Although genes from all three subfamilies are found in
vertebrates, some invertebrate lineages have lost the genes from one or two
subfamilies. The amino terminus of Atg8 proteins varies between the
subfamilies and has a regulatory role in their various functions. Here we
discuss the evolution of Atg8 proteins and summarize the current view of their
function in intracellular trafficking and autophagy from a structural
perspective.
|
| 69 |
Taylor GK, Heiter DF, Pietrokovski S, Stoddard BL
|
|
Activity, specificity and structure of I-Bth0305I: a representative of a new homing endonuclease family
|
|
Nucleic Acids Research 39:9705-9719 (2011) (published online 2 September 2011)
(Pubmed ID:
21890897)
|
|
|
| Abstract |
|
A novel family of putative homing endonuclease genes was recently discovered
during analyses of metagenomic and genomic sequence data. One such protein is
encoded within a group I intron that resides in the recA gene of the Bacillus
thuringiensis 03058-36 bacteriophage. Named I-Bth0305I, the endonuclease
cleaves a DNA target in the uninterrupted recA gene at a position immediately
adjacent to the intron insertion site. The enzyme displays a multidomain,
homodimeric architecture and footprints a DNA region of ~60 bp. Its highest
specificity corresponds to a 14-bp pseudopalindromic sequence that is directly
centered across the DNA cleavage site. Unlike many homing endonucleases, the
specificity profile of the enzyme is evenly distributed across much of its
target site, such that few single base pair substitutions cause a significant
decrease in cleavage activity. A crystal structure of its C-terminal domain
confirms a nuclease fold that is homologous to very short patch repair (Vsr)
endonucleases. The domain architecture and DNA recognition profile displayed
by I-Bth0305I, which is the prototype of a homing lineage that we term the
'EDxHD' family, are distinct from previously characterized homing endonucleases.
|
| 70 |
Samach A, Melamed-Bessudo C, Avivi-Ragolski N, Pietrokovski S, Levy AA
|
|
Identification of plant RAD52 homologs and characterization of the Arabidopsis thaliana RAD52-Like genes
|
|
The Plant Cell doi:10.1105/tpc.111.091744 (2011) (published online December 2011)
(Pubmed ID:
22202891)
|
|
|
| Abstract |
|
RAD52 mediates RAD51 loading onto single-stranded DNA ends, thereby initiating
homologous recombination and catalyzing DNA annealing. RAD52 is highly conserved
among eukaryotes, including animals and fungi. This article reports that RAD52
homologs are present in all plants whose genomes have undergone extensive sequencing.
Computational analyses suggest a very early RAD52 gene duplication, followed by later
lineage-specific duplications, during the evolution of higher plants. Plant
RAD52 proteins have high sequence similarity to the oligomerization and DNA
binding N-terminal domain of RAD52 proteins. Remarkably, the two identified
Arabidopsis thaliana RAD52 genes encode four open reading frames (ORFs)
through differential splicing, each of which specifically localized to the
nucleus, mitochondria, or chloroplast. The A. thaliana RAD52-1A ORF provided
partial complementation to the yeast rad52 mutant. A. thaliana mutants and RNA
interference lines defective in the expression of RAD52-1 or RAD52-2 showed
reduced fertility, sensitivity to mitomycin C, and decreased levels of
intrachromosomal recombination compared with the wild type. In summary,
computational and experimental analyses provide clear evidence for the
presence of functional RAD52 DNA-repair homologs in plants.
|