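A person of integrity does some things and refuses to do others. In recent years, though, science has seen a steady stream of unscrupulous people who will stop at nothing, and the results are shocking. Among them, the “biologist” Kuo-Chen Chou (周国城) perfected the art of the con to a degree nobody else has matched. His “achievements” were famous at home and abroad: for years he sat at the top of the highly cited researcher lists and thrived at universities and research institutes across China.
But as Lu Xun observed, a swindler's tricks take skill and they do work, but only up to a point.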
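In 2019, the journal Bioinformatics began investigating Kuo-Chen Chou's practices, and the Journal of Theoretical Biology followed with its own investigation. In 2020, Nature published an article exposing them, and Chinese media reprinted some of the coverage.
Many journals, however, were wary of collateral damage and declined to give details, so their accounts stayed vague. This post is therefore the first to make those details public. Kuo-Chen Chou was a repeat offender over several decades with a long record of misconduct, and the text involved in each instance is lengthy, so I cannot list them all. Instead, this post presents one representative case and dissects it in depth.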
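“Reviewer #2” below is Kuo-Chen Chou, who demands that the authors cite more than one hundred papers by himself and his son. The journal's name has been replaced with "****" to spare its reputation.
Worried that the authors might find this too much trouble, Kuo-Chen Chou took the job upon himself and wrote five passages for them that cite all of those hundred-plus papers. The passages are nonsense, but as long as the authors insert them into their manuscript, Kuo-Chen Chou will strongly recommend the paper.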
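Who wrote these hundred-plus papers? Readers may notice that references 1 through 18 are authored not by Kuo-Chen Chou (K. C. Chou) but by J. J. Chou. Who is J. J. Chou? None other than Kuo-Chen Chou's son. A father's love for his son is, I suppose, forgivable.
After the J. J. Chou papers comes Kuo-Chen Chou (K. C. Chou) himself. I have marked the name “Chou” in red to make it easier for readers to find.
Sharp-eyed readers may notice that in some references Chou does not appear among the authors at all, for example entries 28-33, 37-41, 43, 45-56, 58-63, 66-67, 70-72, and so on. At first I too assumed that Kuo-Chen Chou had listed a few papers by other people as cover. To my surprise, every one of these papers is also his; he simply left his own name out. The citations to these papers are ultimately credited to him whether or not his name appears in the reference list.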
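Kuo-Chen Chou demanded not only that authors cite his papers but also that they put his name in the title of their paper, to raise his profile. Sometimes he even asked authors to add him to the author list.
A scientist who works hard to publish a paper typically sees it cited a dozen or a few dozen times. A single review by Kuo-Chen Chou could add more than one hundred citations, which is how he stayed at the top of the highly cited lists for years.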
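Why did Kuo-Chen Chou's scheme work? Any editor who actually read the manuscript or the review would have spotted the trick at once. Yet most editors at today's scientific journals are merely filling seats, and reviewers are little more than decoration. So unscrupulous people multiply and fraud runs rampant, leaving academia in a foul and shocking state.
A person of integrity does some things and refuses to do others. The unscrupulous refuse nothing.
Kuo-Chen Chou died several years ago. I only hope that, with one Kuo-Chen Chou fallen, thousands more do not rise in his place.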
---
Reviewer #2:
This is a very interesting paper and hence holds very high potential for publication. But to meet the increasingly high quality standard of ****, a compulsory major revision is absolutely needed according to the following points.
(1) For quite a long period of time, the information derived by the computational approaches (or “in silico” methods) were not trusted very much by most experimental scientists due to the notorious local minimum problem. Actually, they only trusted the results determined by the experiments, and thought computational results were not reliable. The situation has been changed during the last decade or so since the rapid development of structural bioinformatics and sequential bioinformatics. To justify their computational structures, the authors should take the advantage of adding a prelude, such as: “The knowledge of protein 3D (three-dimensional) structures or their complexes with ligands is vitally important for rational drug design. Although X-ray crystallography is a powerful tool in determining these structures, it is time-consuming and expensive, and not all proteins can be successfully crystallized. Membrane proteins are difficult to crystallize and most of them will not dissolve in normal solvents. Therefore, so far very few membrane protein structures have been determined. NMR is indeed a very powerful tool in determining the 3D structures of membrane proteins (see, e.g., [1-4, 5{Chou, 2001 #569, 6-18]), but it is also time-consuming and costly. To acquire the structural information in a timely manner, a series of 3D protein structures have been developed by means of structural bioinformatics tools (see, e.g., [19-31]). Meanwhile, facing the explosive growth of biological sequences discovered in the post-genomic age, to timely use them for drug development, a lot of important sequence-based information, such as PTM (posttranslational modification) sites in proteins [32-36], protein-drug interaction in cellular networking [37], protein-protein interactions [38], DNA-methylation sites [39], recombination spots [40], and sigma-54 promoters [41], have been deducted by various sequential bioinformatics tools such as PseAAC approach [42] and PseKNC approach [43]. 
Actually, the rapid development in sequential bioinformatics and structural bioinformatics have driven the medicinal chemistry undergoing an unprecedented revolution [44], in which the computational biology has played increasingly important roles in stimulating the development of finding novel drugs. In view of this, the computational (or in silico) methods were also utilized in this study .” Adding the above discussion and citing the references therein will greatly help justify the author’s computational approach.
(2) To make the structure of this paper clearer and easier for readers to follow, the authors should in the end of the Introduction (or right before the beginning of describing their own method) add the following: “As demonstrated by a series of recent publications [35, 37, 39-41, 45-66]}[67] and summarized in two comprehensive review papers [68, 69], to develop a really useful predictor for a biological system, one needs to follow Chou’s 5-steps rule to go through the following five steps: (1) select or construct a valid benchmark dataset to train and test the predictor; (2) represent the samples with an effective formulation that can truly reflect their intrinsic correlation with the target to be predicted; (3) introduce or develop a powerful algorithm to conduct the prediction; (4) properly perform cross-validation tests to objectively evaluate the anticipated prediction accuracy; (5) establish a user-friendly web-server for the predictor that is accessible to the public. Papers presented for developing a new sequence-analyzing method or statistical predictor by observing the guidelines of Chou’s 5-step rules have the following notable merits: (1) crystal clear in logic development, (2) completely transparent in operation, (3) easily to repeat the reported results by other investigators, (4) with high potential in stimulating other sequence-analyzing methods, and (5) very convenient to be used by the majority of experimental scientists.” Below, let us elaborate how to deal with these five steps. Also, the authors can refer the readership to an insightful Wikipedia article by clicking the link https://en.wikipedia.org/wiki/5-step_rules.
(3) The title of this paper sounds clumsy. To make it more consistent and harmonic with the above suggestion, it should be accordingly changed to: “Drug toxicity prediction by transcriptomic approach via the Chou’s 5-steps rule”, which is much more accurate, attractive, and stimulating as well.
(4) One of the cornerstones in this study is about feature extraction. But all the features extracted in this paper can be covered by a very powerful web-server called “Pse-in-One” [70] and its updated version “Pse-in-One2.0”, as clearly elucidated very recently [71]. Therefore, to provide the readership with an updated background about using feature extraction to conduct sequence analysis, the authors should in the relevant context add a prelude such as: “With the explosive growth of biological sequences in the post-genomic era, one of the most important but also most difficult problems in computational biology is how to express a biological sequence with a discrete model or a vector, yet still keep considerable sequence-order information or key pattern characteristic. This is because all the existing machine-learning algorithms (such as “Optimization” algorithm [72], “Covariance Discriminant” or “CD” algorithm [73, 74], “Nearest Neighbor” or “NN” algorithm [75], and “Support Vector Machine” or “SVM” algorithm [75, 76]) can only handle vectors as elaborated in a comprehensive review [44]. However, a vector defined in a discrete model may completely lose all the sequence-pattern information. To avoid completely losing the sequence-pattern information for proteins, the pseudo amino acid composition [42] or PseAAC [77] was proposed. Ever since the concept of Chou’s PseAAC was proposed, it has been widely used in nearly all the areas of computational proteomics (see, e.g., [78-81] [82-89] as well as a long list of references cited in [90]). 
Because it has been widely and increasingly used, four powerful open access soft-wares, called ‘PseAAC’ [91], ‘PseAAC-Builder’ [92], ‘propy’ [93], and ‘PseAAC-General’ [94], were established: the former three are for generating various modes of Chou’s special PseAAC [95]; while the 4th one for those of Chou’s general PseAAC [68], including not only all the special modes of feature vectors for proteins but also the higher level feature vectors such as “Functional Domain” mode (see Eqs.9-10 of [68]), “Gene Ontology” mode (see Eqs.11-12 of [68]), and “Sequential Evolution” or “PSSM” mode (see Eqs.13-14 of [68]). Encouraged by the successes of using PseAAC to deal with protein/peptide sequences, the concept of PseKNC (Pseudo K-tuple Nucleotide Composition) [43] was developed for generating various feature vectors for DNA/RNA sequences [96-98] that have proved very useful as well. Particularly, recently a very powerful web-server called ‘Pse-in-One’ [70] and its updated version ‘Pse-in-One2.0’ [71] have been established that can be used to generate any desired feature vectors for protein/peptide and DNA/RNA sequences according to the need of users’ studies”. This further indicates the necessity to change the paper’s title as pointed out in Comment 2
(5) It would be highly appreciated if the authors could provide a web-server to display their findings in a flexible way; i.e., by the web-server, users can manipulate to display the details as desired. It would certainly be very useful for drug design. If the author couldn’t do that now, as a compromise to attract the readership to the author’s future work and to the Journal as well, the author should add a statement in the end of the MS, such as: “As pointed out in [99], user-friendly and publicly accessible web-servers represent the future direction for reporting various important computational analyses and findings (see, e.g., [52, 55, 62, 65-67, 100-115]). Actually, they have significantly enhance the impacts of computational biology on medical science [44], driving medical science into an unprecedented revolution [90]. In my future work I shall strive to establish a web-server for the findings presented in this paper.”
REFERENCES
[1] J.J. Chou, H. Matsuo, H. Duan, G. Wagner, Solution structure of the RAIDD CARD and model for CARD/CARD interaction in caspase-2 and caspase-9 recruitment. Cell 94 (1998) 171-180.
[2] K. Oxenoid, Y.S. Dong, C. Cao, T. Cui, Y. Sancak, A.L. Markhard, Z. Grabarek, L. Kong, Z. Liu, B. Ouyang, Y. Cong, V.K. Mootha, J.J. Chou, Architecture of the Mitochondrial Calcium Uniporter. Nature 533 (2016) 269-273.
[3] J. Dev, D. Park, Q. Fu, J. Chen, H.J. Ha, F. Ghantous, T. Herrmann, W. Chang, Z. Liu, G. Frey, M.S. Seaman, B. Chen, J.J. Chou, Structural Basis for Membrane Anchoring of HIV-1 Envelope Spike. Science 353 (2016) 172-175.
[4] J.R. Schnell, J.J. Chou, Structure and mechanism of the M2 proton channel of influenza A virus. Nature 451 (2008) 591-595.
[5] M.J. Berardi, W.M. Shih, S.C. Harrison, J.J. Chou, Mitochondrial uncoupling protein 2 structure determined by NMR molecular fragment searching. Nature 476 (2011) 109-13.
[6] B. OuYang, S. Xie, M.J. Berardi, X.M. Zhao, J. Dev, W. Yu, B. Sun, J.J. Chou, Unusual architecture of the p7 channel from hepatitis C virus. Nature 498 (2013) 521-525.
[7] J. Wang, R.M. Pielak, M.A. McClintock, J.J. Chou, Solution structure and functional analysis of the influenza B proton channel. Nature Structural and Molecular Biology 16 (2009) 1267-71.
[8] Q. Fu, T.M. Fu, A.C. Cruz, P. Sengupta, S.K. Thomas, S. Wang, R.M. Siegel, H. Wu, J.J. Chou, Structural Basis and Functional Role of Intramembrane Trimerization of the Fas/CD95 Death Receptor. Molecular Cell 61 (2016) 602-13.
[9] J.J. Chou, H. Li, G.S. Salvessen, J. Yuan, G. Wagner, Solution structure of BID, an intracellular amplifier of apoptotic signalling. Cell 96 (1999) 615-624.
[10] J.J. Chou, S. Li, C.B. Klee, A. Bax, Solution structure of Ca2+-calmodulin reveals flexible hand-like properties of its domains. Nature Structural Biology 8 (2001) 990-997.
[11] K. Oxenoid, J.J. Chou, The structure of phospholamban pentamer reveals a channel-like architecture in membranes. Proc Natl Acad Sci U S A 102 (2005) 10870-10875.
[12] M.E. Call, J.R. Schnell, C. Xu, R.A. Lutz, J.J. Chou, K.W. Wucherpfennig, The structure of the zetazeta transmembrane dimer reveals features essential for its assembly with the T cell receptor. Cell 127 (2006) 355-68.
[13] M.E. Call, K.W. Wucherpfennig, J.J. Chou, The structural basis for intramembrane assembly of an activating immunoreceptor complex. Nature Immunology 11 (2010) 1023-1029.
[14] E. Gagnon, C. Xu, W. Yang, H.H. Chu, M.E. Call, J.J. Chou, K.W. Wucherpfennig, Response multilayered control of T cell receptor phosphorylation. Cell 142 (2010) 669-671.
[15] S. Bruschweiler, Q. Yang, C. Run, J.J. Chou, Substrate-modulated ADP/ATP-transporter dynamics revealed by NMR relaxation dispersion. Nat Struct Mol Biol 22 (2015) 636-641.
[16] C. Cao, S. Wang, T. Cui, X.C. Su, J.J. Chou, Ion and inhibitor binding of the double-ring ion selectivity filter of the mitochondrial calcium uniporter. Proc Natl Acad Sci U S A 114 (2017) E2846-E2851.
[17] A. Piai, J. Dev, Q. Fu, J.J. Chou, Stability and Water Accessibility of the Trimeric Membrane Anchors of the HIV-1 Envelope Spikes. J Am Chem Soc 139 (2017) 18432-18435.
[18] L. Pan, T.M. Fu, W. Zhao, L. Zhao, W. Chen, C. Qiu, W. Liu, Z. Liu, A. Piai, Q. Fu, S. Chen, H. Wu, J.J. Chou, Higher-Order Clustering of the Transmembrane Anchor of DR5 Drives Signaling. Cell 176 (2019) 1477-1489 e14.
[19] K.C. Chou, A.G. Tomasselli, R.L. Heinrikson, Prediction of the Tertiary Structure of a Caspase-9/Inhibitor Complex. FEBS Letters 470 (2000) 249-256.
[20] K.C. Chou, D. Jones, R.L. Heinrikson, Prediction of the tertiary structure and substrate binding site of caspase-8. FEBS Letters 419 (1997) 49-54.
[21] K.C. Chou, Insights from modelling the 3D structure of the extracellular domain of alpha7 nicotinic acetylcholine receptor. Biochemical and Biophysical Research Communication (BBRC) 319 (2004) 433-438.
[22] K.C. Chou, Coupling interaction between thromboxane A2 receptor and alpha-13 subunit of guanine nucleotide-binding protein. Journal of Proteome Research 4 (2005) 1681-1686.
[23] K.C. Chou, W.J. Howe, Prediction of the tertiary structure of the beta-secretase zymogen. Biochem. Biophys. Res. Commun (BBRC) 292 (2002) 702-708.
[24] K.C. Chou, Insights from modelling the tertiary structure of BACE2. Journal of Proteome Research 3 (2004) 1069-1072.
[25] K.C. Chou, Insights from modelling three-dimensional structures of the human potassium and sodium channels. Journal of Proteome Research 3 (2004) 856-861.
[26] K.C. Chou, Modeling the tertiary structure of human cathepsin-E. Biochem. Biophys. Res. Commun. (BBRC) 331 (2005) 56-60.
[27] K.C. Chou, Insights from modeling the 3D structure of DNA-CBF3b complex. Journal of Proteome Research 4 (2005) 1657-1660.
[28] S.Q. Wang, Q.S. Du, Study of drug resistance of chicken influenza A virus (H5N1) from homology-modeled 3D structures of neuraminidases. Biochem Biophys Res Comm (BBRC) 354 (2007) 634-640.
[29] S.Q. Wang, Q.S. Du, R.B. Huang, D.W. Zhang, Insights from investigating the interaction of oseltamivir (Tamiflu) with neuraminidase of the 2009 H1N1 swine flu virus. Biochemical and Biophysical Research Communications (BBRC) 386 (2009) 432-436.
[30] X.B. Li, S.Q. Wang, W.R. Xu, R.L. Wang, Novel Inhibitor Design for Hemagglutinin against H1N1 Influenza Virus by Core Hopping Method. PLoS One 6 (2011) e28111.
[31] Y. Ma, S.Q. Wang, W.R. Xu, R.L. Wang, Design novel dual agonists for treating type-2 diabetes by targeting peroxisome proliferator-activated receptors with core hopping approach. PLoS One 7 (2012) e38546.
[32] Y.D. Khan, N. Rasool, W. Hussain, S.A. Khan, iPhosT-PseAAC: Identify phosphothreonine sites by incorporating sequence statistical moments into PseAAC. Analytical Biochemistry 550 (2018) 109-116.
[33] Y.D. Khan, N. Rasool, W. Hussain, S.A. Khan, iPhosY-PseAAC: identify phosphotyrosine sites by incorporating sequence statistical moments into PseAAC. Mol Biol Rep 10.1007/s11033-018-4417-z (2018).
[34] M.F. Sabooh, N. Iqbal, M. Khan, M. Khan, H.F. Maqbool, Identifying 5-methylcytosine sites in RNA sequence using composite encoding feature into Chou's PseKNC. J Theor Biol 452 (2018) 1-9.
[35] W. Hussain, S.D. Khan, N. Rasool, S.A. Khan, SPalmitoylC-PseAAC: A sequence-based model developed via Chou's 5-steps rule and general PseAAC for identifying S-palmitoylation sites in proteins. Anal Biochem 568 (2019) 14-23.
[36] V.K. Shyamili, A. Vellaichamy, Sequence and structure-based characterization of human and yeast ubiquitination sites by using Chou’s sample formulation. Proteins: Structure, Function and Bioinformatics doi:10.1002/prot.25689 (2019).
[37] X. Xiao, J.L. Min, W.Z. Lin, Z. Liu, X. Cheng, iDrug-Target: predicting the interactions between drug compounds and target proteins in cellular networking via the benchmark dataset optimization approach. J Biomol Struct Dyn (JBSD) 33 (2015) 2221-2233.
[38] J. Jia, Z. Liu, X. Xiao, iPPI-Esml: an ensemble classifier for identifying the interactions of proteins by incorporating their physicochemical properties and wavelet transforms into PseAAC. J Theor Biol 377 (2015) 47-56.
[39] Z. Liu, X. Xiao, W.R. Qiu, iDNA-Methyl: Identifying DNA methylation sites via pseudo trinucleotide composition. Analytical Biochemistry 474 (2015) 69-77.
[40] W. Chen, P.M. Feng, H. Lin, iRSpot-PseDNC: identify recombination spots with pseudo dinucleotide composition Nucleic Acids Research 41 (2013) e68.
[41] H. Lin, E.Z. Deng, H. Ding, W. Chen, iPro54-PseKNC: a sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition. Nucleic Acids Research 42 (2014) 12961-12972.
[42] K.C. Chou, Prediction of protein cellular attributes using pseudo amino acid composition. PROTEINS: Structure, Function, and Genetics (Erratum: ibid., 2001, Vol.44, 60) 43 (2001) 246-255.
[43] W. Chen, T.Y. Lei, D.C. Jin, H. Lin, PseKNC: a flexible web-server for generating pseudo K-tuple nucleotide composition. Analytical Biochemistry 456 (2014) 53-60.
[44] K.C. Chou, Impacts of bioinformatics to medicinal chemistry. Medicinal Chemistry 11 (2015) 218-234.
[45] P.M. Feng, W. Chen, H. Lin, iHSP-PseRAAAC: Identifying the heat shock protein families using pseudo reduced amino acid alphabet composition. Analytical Biochemistry 442 (2013) 118-25.
[46] W. Chen, P.M. Feng, E.Z. Deng, H. Lin, iTIS-PseTNC: a sequence-based predictor for identifying translation initiation site in human genes using pseudo trinucleotide composition. Analytical Biochemistry 462 (2014) 76-83.
[47] H. Ding, E.Z. Deng, L.F. Yuan, L. Liu, H. Lin, W. Chen, iCTX-Type: A sequence-based predictor for identifying the types of conotoxins in targeting ion channels. BioMed Research International (BMRI) 2014 (2014) 286419.
[48] B. Liu, L. Fang, S. Wang, X. Wang, H. Li, Identification of microRNA precursor with the degenerate K-tuple or Kmer strategy. Journal of Theoretical Biology 385 (2015) 153-159.
[49] J. Jia, Z. Liu, X. Xiao, B. Liu, iSuc-PseOpt: Identifying lysine succinylation sites in proteins by incorporating sequence-coupling effects into pseudo components and optimizing imbalanced training dataset. Anal Biochem 497 (2016) 48-56.
[50] J. Jia, L. Zhang, Z. Liu, X. Xiao, pSumo-CD: Predicting sumoylation sites in proteins with covariance discriminant algorithm by incorporating sequence-coupled effects into general PseAAC. Bioinformatics 32 (2016) 3133-3141.
[51] B. Liu, L. Fang, R. Long, X. Lan, iEnhancer-2L: a two-layer predictor for identifying enhancers and their strength by pseudo k-tuple nucleotide composition. Bioinformatics 32 (2016) 362-369.
[52] W. Chen, P. Feng, H. Yang, H. Ding, H. Lin, iRNA-AI: identifying the adenosine to inosine editing sites in RNA sequences. Oncotarget 8 (2017) 4208-4217.
[53] W. Chen, H. Ding, X. Zhou, H. Lin, iRNA(m6A)-PseDNC: Identifying N6-methyladenosine sites using pseudo dinucleotide composition. Analytical Biochemistry 561-562 (2018) 59-65.
[54] W. Chen, P. Feng, H. Yang, H. Ding, H. Lin, iRNA-3typeA: identifying 3-types of modification at RNA’s adenosine sites. Molecular Therapy: Nucleic Acid 11 (2018) 468-474.
[55] W.R. Qiu, B.Q. Sun, X. Xiao, Z.C. Xu, J.H. Jia, iKcr-PseEns: Identify lysine crotonylation sites in histone proteins with pseudo components and ensemble classifier. Genomics 110 (2018) 239-246.
[56] P. Feng, H. Yang, H. Ding, H. Lin, W. Chen, iDNA6mA-PseKNC: Identifying DNA N(6)-methyladenosine sites by incorporating nucleotide physicochemical properties into PseKNC. Genomics 111 (2019) 96-102.
[57] W. Hussain, Y.D. Khan, N. Rasool, S.A. Khan, SPrenylC-PseAAC: A sequence-based model developed via Chou's 5-steps rule and general PseAAC for identifying S-prenylation sites in proteins. J Theor Biol 468 (2019) 1-11.
[58] J. Jia, X. Li, W. Qiu, X. Xiao, iPPI-PseAAC(CGR): Identify protein-protein interactions by incorporating chaos game representation into PseAAC. Journal of Theoretical Biology 460 (2019) 195-203.
[59] Y.D. Khan, M. Jamil, W. Hussain, N. Rasool, S.A. Khan, pSSbond-PseAAC: Prediction of disulfide bonding sites by integration of PseAAC and statistical moments. J Theor Biol 463 (2019) 47-55.
[60] Y. Lu, S. Wang, J. Wang, G. Zhou, Q. Zhang, X. Zhou, B. Niu, Q. Chen, An Epidemic Avian Influenza Prediction Model Based on Google Trends. Letters in Organic Chemistry 16 (2019) 303-310.
[61] Y.D. Khan, A. Batool, N. Rasool, A. Khan, Prediction of nitrosocysteine sites using position and composition variant features. Letters in Organic Chemistry 16 (2019) 283-293.
[62] X. Cheng, X. Xiao, pLoc_bal-mPlant: predict subcellular localization of plant proteins by general PseAAC and balancing training dataset Curr Pharm Des 24 (2018) 4013-4022.
[63] J.X. Li, S.Q. Wang, Q.S. Du, H. Wei, X.M. Li, J.Z. Meng, Q.Y. Wang, N.Z. Xie, R.B. Huang, Simulated protein thermal detection (SPTD) for enzyme thermostability study and an application example for pullulanase from Bacillus deramificans. Curr Pharm Des 24 (2018) 4023-4033.
[64] A.W. Ghauri, Y.D. Khan, N. Rasool, S.A. Khan, pNitro-Tyr-PseAAC: Predict nitrotyrosine sites in proteins by incorporating five features into Chou's general PseAAC. Curr Pharm Des 24 (2018) 4034-4043.
[65] K.C. Chou, X. Cheng, X. Xiao, pLoc_bal-mEuk: predict subcellular localization of eukaryotic proteins by general PseAAC and quasi-balancing training dataset. Med Chem 15 (2019) 472-485.
[66] X. Xiao, X. Cheng, G. Chen, Q. Mao, pLoc_bal-mGpos: predict subcellular localization of Gram-positive bacterial proteins by quasi-balancing training dataset and PseAAC. Genomics doi:10.1016/j.ygeno.2018.05.017 (2018).
[67] X. Xiao, X. Cheng, G. Chen, Q. Mao, pLoc_bal-mVirus: predict subcellular localization of multi-label virus proteins by PseAAC and IHTS treatment to balance training dataset. Med Chem 15 (2019) 496-509.
[68] K.C. Chou, Some remarks on protein attribute prediction and pseudo amino acid composition (50th Anniversary Year Review, 5-steps rule). Journal of Theoretical Biology 273 (2011) 236-247.
[69] K.C. Chou, Advance in predicting subcellular localization of multi-label proteins and its implication for developing multi-target drugs. Current Medicinal Chemistry doi: 10.2174/0929867326666190507082559 (2019).
[70] B. Liu, F. Liu, X. Wang, J. Chen, L. Fang, Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences. Nucleic Acids Research 43 (2015) W65-W71.
[71] B. Liu, H. Wu, Pse-in-One 2.0: An improved package of web servers for generating various modes of pseudo components of DNA, RNA, and protein sequences. Natural Science 9 (2017) 67-91.
[72] C.T. Zhang, An optimization approach to predicting protein structural class from amino acid composition. Protein Science 1 (1992) 401-408.
[73] K.C. Chou, D.W. Elrod, Bioinformatical analysis of G-protein-coupled receptors. Journal of Proteome Research 1 (2002) 429-433.
[74] K.C. Chou, Y.D. Cai, Prediction and classification of protein subcellular location: sequence-order effect and pseudo amino acid composition. Journal of Cellular Biochemistry (Addendum, ibid. 2004, 91, 1085) 90 (2003) 1250-1260.
[75] L. Hu, T. Huang, X. Shi, W.C. Lu, Y.D. Cai, Predicting functions of proteins in mouse based on weighted protein-protein interaction network and protein hybrid properties PLoS ONE 6 (2011) e14556.
[76] Y.D. Cai, K.Y. Feng, W.C. Lu, Using LogitBoost classifier to predict protein structural classes. Journal of Theoretical Biology 238 (2006) 172-176.
[77] K.C. Chou, Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics 21 (2005) 10-19.
[78] A. Dehzangi, R. Heffernan, A. Sharma, J. Lyons, K. Paliwal, A. Sattar, Gram-positive and Gram-negative protein subcellular localization by incorporating evolutionary-based descriptors into Chou's general PseAAC. J Theor Biol 364 (2015) 284-294.
[79] M. Behbahani, H. Mohabatkar, M. Nosrati, Analysis and comparison of lignin peroxidases between fungi and bacteria using three different modes of Chou's general pseudo amino acid composition. J Theor Biol 411 (2016) 1-5.
[80] M. Kabir, M. Hayat, iRSpot-GAEnsC: identifing recombination spots via ensemble classifier and extending the concept of Chou's PseAAC to formulate DNA samples. Molecular Genetics and Genomics 291 (2016) 285-96.
[81] P.K. Meher, T.K. Sahu, V. Saini, A.R. Rao, Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou's general PseAAC. Sci Rep 7 (2017) 42362.
[82] Z. Ju, J.J. He, Prediction of lysine propionylation sites using biased SVM and incorporating four different sequence features into Chou's PseAAC. J Mol Graph Model 76 (2017) 356-363.
[83] B. Yu, S. Li, W.Y. Qiu, C. Chen, R.X. Chen, L. Wang, M.H. Wang, Y. Zhang, Accurate prediction of subcellular location of apoptosis proteins combining Chou's PseAAC and PsePSSM based on wavelet denoising. Oncotarget 8 (2017) 107640-107665.
[84] J. Ahmad, M. Hayat, MFSC: Multi-voting based Feature Selection for Classification of Golgi Proteins by Adopting the General form of Chou's PseAAC components. J Theor Biol 463 (2018) 99-109.
[85] S. Akbar, M. Hayat, iMethyl-STTNC: Identification of N(6)-methyladenosine sites by extending the Idea of SAAC into Chou's PseAAC to formulate RNA sequences. J Theor Biol 455 (2018) 205-211.
[86] E. Contreras-Torres, Predicting structural classes of proteins by incorporating their global and local physicochemical and conformational properties into general Chou's PseAAC. J Theor Biol 454 (2018) 139-145.
[87] S. Zhang, Y. Liang, Predicting apoptosis protein subcellular localization by integrating auto-cross correlation and PSSM into Chou's PseAAC. J Theor Biol 457 (2018) 163-169.
[88] J. Ahmad, M. Hayat, MFSC: Multi-voting based feature selection for classification of Golgi proteins by adopting the general form of Chou's PseAAC components. J Theor Biol 463 (2019) 99-109.
[89] M. Tahir, M. Hayat, S.A. Khan, iNuc-ext-PseTNC: an efficient ensemble model for identification of nucleosome positioning by extending the concept of Chou's PseAAC to pseudo-tri-nucleotide composition. Mol Genet Genomics 294 (2019) 199-210.
[90] K.C. Chou, An unprecedented revolution in medicinal chemistry driven by the progress of biological science. Current Topics in Medicinal Chemistry 17 (2017) 2337-2358.
[91] H.B. Shen, PseAAC: a flexible web-server for generating various kinds of protein pseudo amino acid composition. Analytical Biochemistry 373 (2008) 386-388.
[92] P. Du, X. Wang, C. Xu, Y. Gao, PseAAC-Builder: A cross-platform stand-alone program for generating various special Chou's pseudo amino acid compositions. Analytical Biochemistry 425 (2012) 117-119.
[93] D.S. Cao, Q.S. Xu, Y.Z. Liang, propy: a tool to generate various modes of Chou's PseAAC. Bioinformatics 29 (2013) 960-962.
[94] P. Du, S. Gu, Y. Jiao, PseAAC-General: Fast building various modes of general form of Chou's pseudo amino acid composition for large-scale protein datasets. International Journal of Molecular Sciences 15 (2014) 3495-3506.
[95] K.C. Chou, Pseudo amino acid composition and its applications in bioinformatics, proteomics and system biology. Current Proteomics 6 (2009) 262-274.
[96] W. Chen, H. Lin, Pseudo nucleotide composition or PseKNC: an effective formulation for analyzing genomic sequences. Mol BioSyst 11 (2015) 2620-2634.
[97] B. Liu, F. Yang, D.S. Huang, iPromoter-2L: a two-layer predictor for identifying promoters and their types by multi-window-based PseKNC. Bioinformatics 34 (2018) 33-40.
[98] M. Tahir, H. Tayara, K.T. Chong, iRNA-PseKNC(2methyl): Identify RNA 2'-O-methylation sites by convolution neural network and Chou's pseudo components. J Theor Biol 465 (2019) 1-6.
[99] K.C. Chou, H.B. Shen, Recent advances in developing web-servers for predicting protein attributes. Natural Science 1 (2009) 63-92
[100] X. Cheng, X. Xiao, pLoc-mPlant: predict subcellular localization of multi-location plant proteins via incorporating the optimal GO information into general PseAAC. Molecular BioSystems 13 (2017) 1722-1727.
[101] X. Cheng, X. Xiao, pLoc-mVirus: predict subcellular localization of multi-location virus proteins via incorporating the optimal GO information into general PseAAC. Gene (Erratum: ibid., 2018, Vol.644, 156-156) 628 (2017) 315-321.
[102] X. Cheng, X. Xiao, pLoc-mEuk: Predict subcellular localization of multi-label eukaryotic proteins by extracting the key GO information into general PseAAC. Genomics 110 (2018) 50-58.
[103] X. Cheng, X. Xiao, pLoc-mGneg: Predict subcellular localization of Gram-negative bacterial proteins by deep gene ontology learning via general PseAAC. Genomics 110 (2018) 231-239.
[104] X. Cheng, S.G. Zhao, W.Z. Lin, X. Xiao, pLoc-mAnimal: predict subcellular localization of animal proteins with both single and multiple sites. Bioinformatics 33 (2017) 3524-3531.
[105] X. Xiao, X. Cheng, S. Su, Q. Nao, pLoc-mGpos: Incorporate key gene ontology information into general PseAAC for predicting subcellular localization of Gram-positive bacterial proteins. Natural Science 9 (2017) 331-349.
[106] X. Cheng, X. Xiao, pLoc-mHum: predict subcellular localization of multi-location human proteins via general PseAAC to winnow out the crucial GO information. Bioinformatics 34 (2018) 1448-1456.
[107] X. Cheng, S.G. Zhao, X. Xiao, iATC-mISF: a multi-label classifier for predicting the classes of anatomical therapeutic chemicals. Bioinformatics (Corrigendum, ibid., 2017, Vol.33, 2610) 33 (2017) 341-346.
[108] P. Feng, H. Ding, H. Yang, W. Chen, H. Lin, iRNA-PseColl: Identifying the occurrence sites of different RNA modifications by incorporating collective effects of nucleotides into PseKNC. Molecular Therapy - Nucleic Acids 7 (2017) 155-163.
[109] B. Liu, S. Wang, R. Long, iRSpot-EL: identify recombination spots with an ensemble learning approach. Bioinformatics 33 (2017) 35-41.
[110] B. Liu, F. Yang, 2L-piRNA: A two-layer ensemble classifier for identifying piwi-interacting RNAs and their function. Molecular Therapy - Nucleic Acids 7 (2017) 267-277.
[111] W.R. Qiu, S.Y. Jiang, Z.C. Xu, X. Xiao, iRNAm5C-PseDNC: identifying RNA 5-methylcytosine sites by incorporating physical-chemical properties into pseudo dinucleotide composition. Oncotarget 8 (2017) 41178-41188.
[112] W.R. Qiu, B.Q. Sun, X. Xiao, D. Xu, iPhos-PseEvo: Identifying human phosphorylated proteins by incorporating evolutionary information into general PseAAC via grey system theory. Molecular Informatics 36 (2017) UNSP 1600010.
[113] X. Cheng, W.Z. Lin, X. Xiao, pLoc_bal-mAnimal: predict subcellular localization of animal proteins by balancing training dataset and PseAAC. Bioinformatics 35 (2019) 398-406.
[114] X. Cheng, X. Xiao, pLoc_bal-mGneg: predict subcellular localization of Gram-negative bacterial proteins by quasi-balancing training dataset and general PseAAC. Journal of Theoretical Biology 458 (2018) 92-102.
[115] K.C. Chou, X. Cheng, X. Xiao, pLoc_bal-mHum: predict subcellular localization of human proteins by PseAAC and quasi-balancing training dataset Genomics doi:10.1016/j.ygeno.2018.08.007 (2018).
A Once-Celebrated Academic Fraud
ShiMaQian (2025-12-30 10:03:15)
Actually, the rapid development in sequential bioinformatics and structural bioinformatics have driven the medicinal chemistry undergoing an unprecedented revolution [44], in which the computational biology has played increasingly important roles in stimulating the development of finding novel drugs. In view of this, the computational (or in silico) methods were also utilized in this study .” Adding the above discussion and citing the references therein will greatly help justify the author’s computational approach.
(2) To make the structure of this paper clearer and easier for readers to follow, the authors should in the end of the Introduction (or right before the beginning of describing their own method) add the following: “As demonstrated by a series of recent publications [35, 37, 39-41, 45-66]}[67] and summarized in two comprehensive review papers [68, 69], to develop a really useful predictor for a biological system, one needs to follow Chou’s 5-steps rule to go through the following five steps: (1) select or construct a valid benchmark dataset to train and test the predictor; (2) represent the samples with an effective formulation that can truly reflect their intrinsic correlation with the target to be predicted; (3) introduce or develop a powerful algorithm to conduct the prediction; (4) properly perform cross-validation tests to objectively evaluate the anticipated prediction accuracy; (5) establish a user-friendly web-server for the predictor that is accessible to the public. Papers presented for developing a new sequence-analyzing method or statistical predictor by observing the guidelines of Chou’s 5-step rules have the following notable merits: (1) crystal clear in logic development, (2) completely transparent in operation, (3) easily to repeat the reported results by other investigators, (4) with high potential in stimulating other sequence-analyzing methods, and (5) very convenient to be used by the majority of experimental scientists.” Below, let us elaborate how to deal with these five steps. Also, the authors can refer the readership to an insightful Wikipedia article by clicking the link https://en.wikipedia.org/wiki/5-step_rules.
(3) The title of this paper sounds clumsy. To make it more consistent and harmonic with the above suggestion, it should be accordingly changed to: “Drug toxicity prediction by transcriptomic approach via the Chou’s 5-steps rule”, which is much more accurate, attractive, and stimulating as well.
(4) One of the cornerstones in this study is about feature extraction. But all the features extracted in this paper can be covered by a very powerful web-server called “Pse-in-One” [70] and its updated version “Pse-in-One2.0”, as clearly elucidated very recently [71]. Therefore, to provide the readership with an updated background about using feature extraction to conduct sequence analysis, the authors should in the relevant context add a prelude such as: “With the explosive growth of biological sequences in the post-genomic era, one of the most important but also most difficult problems in computational biology is how to express a biological sequence with a discrete model or a vector, yet still keep considerable sequence-order information or key pattern characteristic. This is because all the existing machine-learning algorithms (such as “Optimization” algorithm [72], “Covariance Discriminant” or “CD” algorithm [73, 74], “Nearest Neighbor” or “NN” algorithm [75], and “Support Vector Machine” or “SVM” algorithm [75, 76]) can only handle vectors as elaborated in a comprehensive review [44]. However, a vector defined in a discrete model may completely lose all the sequence-pattern information. To avoid completely losing the sequence-pattern information for proteins, the pseudo amino acid composition [42] or PseAAC [77] was proposed. Ever since the concept of Chou’s PseAAC was proposed, it has been widely used in nearly all the areas of computational proteomics (see, e.g., [78-81] [82-89] as well as a long list of references cited in [90]). 
Because it has been widely and increasingly used, four powerful open access soft-wares, called ‘PseAAC’ [91], ‘PseAAC-Builder’ [92], ‘propy’ [93], and ‘PseAAC-General’ [94], were established: the former three are for generating various modes of Chou’s special PseAAC [95]; while the 4th one for those of Chou’s general PseAAC [68], including not only all the special modes of feature vectors for proteins but also the higher level feature vectors such as “Functional Domain” mode (see Eqs.9-10 of [68]), “Gene Ontology” mode (see Eqs.11-12 of [68]), and “Sequential Evolution” or “PSSM” mode (see Eqs.13-14 of [68]). Encouraged by the successes of using PseAAC to deal with protein/peptide sequences, the concept of PseKNC (Pseudo K-tuple Nucleotide Composition) [43] was developed for generating various feature vectors for DNA/RNA sequences [96-98] that have proved very useful as well. Particularly, recently a very powerful web-server called ‘Pse-in-One’ [70] and its updated version ‘Pse-in-One2.0’ [71] have been established that can be used to generate any desired feature vectors for protein/peptide and DNA/RNA sequences according to the need of users’ studies”. This further indicates the necessity to change the paper’s title as pointed out in Comment 2
(5) It would be highly appreciated if the authors could provide a web-server to display their findings in a flexible way; i.e., by the web-server, users can manipulate to display the details as desired. It would certainly be very useful for drug design. If the author couldn’t do that now, as a compromise to attract the readership to the author’s future work and to the Journal as well, the author should add a statement in the end of the MS, such as: “As pointed out in [99], user-friendly and publicly accessible web-servers represent the future direction for reporting various important computational analyses and findings (see, e.g., [52, 55, 62, 65-67, 100-115]). Actually, they have significantly enhance the impacts of computational biology on medical science [44], driving medical science into an unprecedented revolution [90]. In my future work I shall strive to establish a web-server for the findings presented in this paper.”
REFEREANCES
[1] J.J. Chou, H. Matsuo, H. Duan, G. Wagner, Solution structure of the RAIDD CARD and model for CARD/CARD interaction in caspase-2 and caspase-9 recruitment. Cell 94 (1998) 171-180.
[2] K. Oxenoid, Y.S. Dong, C. Cao, T. Cui, Y. Sancak, A.L. Markhard, Z. Grabarek, L. Kong, Z. Liu, B. Ouyang, Y. Cong, V.K. Mootha, J.J. Chou, Architecture of the Mitochondrial Calcium Uniporter. Nature 533 (2016) 269-273.
[3] J. Dev, D. Park, Q. Fu, J. Chen, H.J. Ha, F. Ghantous, T. Herrmann, W. Chang, Z. Liu, G. Frey, M.S. Seaman, B. Chen, J.J. Chou, Structural Basis for Membrane Anchoring of HIV-1 Envelope Spike. Science 353 (2016) 172-175.
[4] J.R. Schnell, J.J. Chou, Structure and mechanism of the M2 proton channel of influenza A virus. Nature 451 (2008) 591-595.
[5] M.J. Berardi, W.M. Shih, S.C. Harrison, J.J. Chou, Mitochondrial uncoupling protein 2 structure determined by NMR molecular fragment searching. Nature 476 (2011) 109-13.
[6] B. OuYang, S. Xie, M.J. Berardi, X.M. Zhao, J. Dev, W. Yu, B. Sun, J.J. Chou, Unusual architecture of the p7 channel from hepatitis C virus. Nature 498 (2013) 521-525.
[7] J. Wang, R.M. Pielak, M.A. McClintock, J.J. Chou, Solution structure and functional analysis of the influenza B proton channel. Nature Structural and Molecular Biology 16 (2009) 1267-71.
[8] Q. Fu, T.M. Fu, A.C. Cruz, P. Sengupta, S.K. Thomas, S. Wang, R.M. Siegel, H. Wu, J.J. Chou, Structural Basis and Functional Role of Intramembrane Trimerization of the Fas/CD95 Death Receptor. Molecular Cell 61 (2016) 602-13.
[9] J.J. Chou, H. Li, G.S. Salvessen, J. Yuan, G. Wagner, Solution structure of BID, an intracellular amplifier of apoptotic signalling. Cell 96 (1999) 615-624.
[10] J.J. Chou, S. Li, C.B. Klee, A. Bax, Solution structure of Ca2+-calmodulin reveals flexible hand-like properties of its domains. Nature Structural Biology 8 (2001) 990-997.
[11] K. Oxenoid, J.J. Chou, The structure of phospholamban pentamer reveals a channel-like architecture in membranes. Proc Natl Acad Sci U S A 102 (2005) 10870-10875.
[12] M.E. Call, J.R. Schnell, C. Xu, R.A. Lutz, J.J. Chou, K.W. Wucherpfennig, The structure of the zetazeta transmembrane dimer reveals features essential for its assembly with the T cell receptor. Cell 127 (2006) 355-68.
[13] M.E. Call, K.W. Wucherpfennig, J.J. Chou, The structural basis for intramembrane assembly of an activating immunoreceptor complex. Nature Immunology 11 (2010) 1023-1029.
[14] E. Gagnon, C. Xu, W. Yang, H.H. Chu, M.E. Call, J.J. Chou, K.W. Wucherpfennig, Response multilayered control of T cell receptor phosphorylation. Cell 142 (2010) 669-671.
[15] S. Bruschweiler, Q. Yang, C. Run, J.J. Chou, Substrate-modulated ADP/ATP-transporter dynamics revealed by NMR relaxation dispersion. Nat Struct Mol Biol 22 (2015) 636-641.
[16] C. Cao, S. Wang, T. Cui, X.C. Su, J.J. Chou, Ion and inhibitor binding of the double-ring ion selectivity filter of the mitochondrial calcium uniporter. Proc Natl Acad Sci U S A 114 (2017) E2846-E2851.
[17] A. Piai, J. Dev, Q. Fu, J.J. Chou, Stability and Water Accessibility of the Trimeric Membrane Anchors of the HIV-1 Envelope Spikes. J Am Chem Soc 139 (2017) 18432-18435.
[18] L. Pan, T.M. Fu, W. Zhao, L. Zhao, W. Chen, C. Qiu, W. Liu, Z. Liu, A. Piai, Q. Fu, S. Chen, H. Wu, J.J. Chou, Higher-Order Clustering of the Transmembrane Anchor of DR5 Drives Signaling. Cell 176 (2019) 1477-1489 e14.
[19] K.C. Chou, A.G. Tomasselli, R.L. Heinrikson, Prediction of the Tertiary Structure of a Caspase-9/Inhibitor Complex. FEBS Letters 470 (2000) 249-256.
[20] K.C. Chou, D. Jones, R.L. Heinrikson, Prediction of the tertiary structure and substrate binding site of caspase-8. FEBS Letters 419 (1997) 49-54.
[21] K.C. Chou, Insights from modelling the 3D structure of the extracellular domain of alpha7 nicotinic acetylcholine receptor. Biochemical and Biophysical Research Communication (BBRC) 319 (2004) 433-438.
[22] K.C. Chou, Coupling interaction between thromboxane A2 receptor and alpha-13 subunit of guanine nucleotide-binding protein. Journal of Proteome Research 4 (2005) 1681-1686.
[23] K.C. Chou, W.J. Howe, Prediction of the tertiary structure of the beta-secretase zymogen. Biochem. Biophys. Res. Commun (BBRC) 292 (2002) 702-708.
[24] K.C. Chou, Insights from modelling the tertiary structure of BACE2. Journal of Proteome Research 3 (2004) 1069-1072.
[25] K.C. Chou, Insights from modelling three-dimensional structures of the human potassium and sodium channels. Journal of Proteome Research 3 (2004) 856-861.
[26] K.C. Chou, Modeling the tertiary structure of human cathepsin-E. Biochem. Biophys. Res. Commun. (BBRC) 331 (2005) 56-60.
[27] K.C. Chou, Insights from modeling the 3D structure of DNA-CBF3b complex. Journal of Proteome Research 4 (2005) 1657-1660.
[28] S.Q. Wang, Q.S. Du, Study of drug resistance of chicken influenza A virus (H5N1) from homology-modeled 3D structures of neuraminidases. Biochem Biophys Res Comm (BBRC) 354 (2007) 634-640.
[29] S.Q. Wang, Q.S. Du, R.B. Huang, D.W. Zhang, Insights from investigating the interaction of oseltamivir (Tamiflu) with neuraminidase of the 2009 H1N1 swine flu virus. Biochemical and Biophysical Research Communications (BBRC) 386 (2009) 432-436.
[30] X.B. Li, S.Q. Wang, W.R. Xu, R.L. Wang, Novel Inhibitor Design for Hemagglutinin against H1N1 Influenza Virus by Core Hopping Method. PLoS One 6 (2011) e28111.
[31] Y. Ma, S.Q. Wang, W.R. Xu, R.L. Wang, Design novel dual agonists for treating type-2 diabetes by targeting peroxisome proliferator-activated receptors with core hopping approach. PLoS One 7 (2012) e38546.
[32] Y.D. Khan, N. Rasool, W. Hussain, S.A. Khan, iPhosT-PseAAC: Identify phosphothreonine sites by incorporating sequence statistical moments into PseAAC. Analytical Biochemistry 550 (2018) 109-116.
[33] Y.D. Khan, N. Rasool, W. Hussain, S.A. Khan, iPhosY-PseAAC: identify phosphotyrosine sites by incorporating sequence statistical moments into PseAAC. Mol Biol Rep 10.1007/s11033-018-4417-z (2018).
[34] M.F. Sabooh, N. Iqbal, M. Khan, M. Khan, H.F. Maqbool, Identifying 5-methylcytosine sites in RNA sequence using composite encoding feature into Chou's PseKNC. J Theor Biol 452 (2018) 1-9.
[35] W. Hussain, S.D. Khan, N. Rasool, S.A. Khan, SPalmitoylC-PseAAC: A sequence-based model developed via Chou's 5-steps rule and general PseAAC for identifying S-palmitoylation sites in proteins. Anal Biochem 568 (2019) 14-23.
[36] V.K. Shyamili, A. Vellaichamy, Sequence and structure-based characterization of human and yeast ubiquitination sites by using Chou’s sample formulation. Proteins: Structure, Function and Bioinformatics doi:10.1002/prot.25689 (2019).
[37] X. Xiao, J.L. Min, W.Z. Lin, Z. Liu, X. Cheng, iDrug-Target: predicting the interactions between drug compounds and target proteins in cellular networking via the benchmark dataset optimization approach. J Biomol Struct Dyn (JBSD) 33 (2015) 2221-2233.
[38] J. Jia, Z. Liu, X. Xiao, iPPI-Esml: an ensemble classifier for identifying the interactions of proteins by incorporating their physicochemical properties and wavelet transforms into PseAAC. J Theor Biol 377 (2015) 47-56.
[39] Z. Liu, X. Xiao, W.R. Qiu, iDNA-Methyl: Identifying DNA methylation sites via pseudo trinucleotide composition. Analytical Biochemistry 474 (2015) 69-77.
[40] W. Chen, P.M. Feng, H. Lin, iRSpot-PseDNC: identify recombination spots with pseudo dinucleotide composition Nucleic Acids Research 41 (2013) e68.
[41] H. Lin, E.Z. Deng, H. Ding, W. Chen, iPro54-PseKNC: a sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition. Nucleic Acids Research 42 (2014) 12961-12972.
[42] K.C. Chou, Prediction of protein cellular attributes using pseudo amino acid composition. PROTEINS: Structure, Function, and Genetics (Erratum: ibid., 2001, Vol.44, 60) 43 (2001) 246-255.
[43] W. Chen, T.Y. Lei, D.C. Jin, H. Lin, PseKNC: a flexible web-server for generating pseudo K-tuple nucleotide composition. Analytical Biochemistry 456 (2014) 53-60.
[44] K.C. Chou, Impacts of bioinformatics to medicinal chemistry. Medicinal Chemistry 11 (2015) 218-234.
[45] P.M. Feng, W. Chen, H. Lin, iHSP-PseRAAAC: Identifying the heat shock protein families using pseudo reduced amino acid alphabet composition. Analytical Biochemistry 442 (2013) 118-25.
[46] W. Chen, P.M. Feng, E.Z. Deng, H. Lin, iTIS-PseTNC: a sequence-based predictor for identifying translation initiation site in human genes using pseudo trinucleotide composition. Analytical Biochemistry 462 (2014) 76-83.
[47] H. Ding, E.Z. Deng, L.F. Yuan, L. Liu, H. Lin, W. Chen, iCTX-Type: A sequence-based predictor for identifying the types of conotoxins in targeting ion channels. BioMed Research International (BMRI) 2014 (2014) 286419.
[48] B. Liu, L. Fang, S. Wang, X. Wang, H. Li, Identification of microRNA precursor with the degenerate K-tuple or Kmer strategy. Journal of Theoretical Biology 385 (2015) 153-159.
[49] J. Jia, Z. Liu, X. Xiao, B. Liu, iSuc-PseOpt: Identifying lysine succinylation sites in proteins by incorporating sequence-coupling effects into pseudo components and optimizing imbalanced training dataset. Anal Biochem 497 (2016) 48-56.
[50] J. Jia, L. Zhang, Z. Liu, X. Xiao, pSumo-CD: Predicting sumoylation sites in proteins with covariance discriminant algorithm by incorporating sequence-coupled effects into general PseAAC. Bioinformatics 32 (2016) 3133-3141.
[51] B. Liu, L. Fang, R. Long, X. Lan, iEnhancer-2L: a two-layer predictor for identifying enhancers and their strength by pseudo k-tuple nucleotide composition. Bioinformatics 32 (2016) 362-369.
[52] W. Chen, P. Feng, H. Yang, H. Ding, H. Lin, iRNA-AI: identifying the adenosine to inosine editing sites in RNA sequences. Oncotarget 8 (2017) 4208-4217.
[53] W. Chen, H. Ding, X. Zhou, H. Lin, iRNA(m6A)-PseDNC: Identifying N6-methyladenosine sites using pseudo dinucleotide composition. Analytical Biochemistry 561-562 (2018) 59-65.
[54] W. Chen, P. Feng, H. Yang, H. Ding, H. Lin, iRNA-3typeA: identifying 3-types of modification at RNA’s adenosine sites. Molecular Therapy: Nucleic Acid 11 (2018) 468-474.
[55] W.R. Qiu, B.Q. Sun, X. Xiao, Z.C. Xu, J.H. Jia, iKcr-PseEns: Identify lysine crotonylation sites in histone proteins with pseudo components and ensemble classifier. Genomics 110 (2018) 239-246.
[56] P. Feng, H. Yang, H. Ding, H. Lin, W. Chen, iDNA6mA-PseKNC: Identifying DNA N(6)-methyladenosine sites by incorporating nucleotide physicochemical properties into PseKNC. Genomics 111 (2019) 96-102.
[57] W. Hussain, Y.D. Khan, N. Rasool, S.A. Khan, SPrenylC-PseAAC: A sequence-based model developed via Chou's 5-steps rule and general PseAAC for identifying S-prenylation sites in proteins. J Theor Biol 468 (2019) 1-11.
[58] J. Jia, X. Li, W. Qiu, X. Xiao, iPPI-PseAAC(CGR): Identify protein-protein interactions by incorporating chaos game representation into PseAAC. Journal of Theoretical Biology 460 (2019) 195-203.
[59] Y.D. Khan, M. Jamil, W. Hussain, N. Rasool, S.A. Khan, pSSbond-PseAAC: Prediction of disulfide bonding sites by integration of PseAAC and statistical moments. J Theor Biol 463 (2019) 47-55.
[60] Y. Lu, S. Wang, J. Wang, G. Zhou, Q. Zhang, X. Zhou, B. Niu, Q. Chen, An Epidemic Avian Influenza Prediction Model Based on Google Trends. Letters in Organic Chemistry 16 (2019) 303-310.
[61] Y.D. Khan, A. Batool, N. Rasool, A. Khan, Prediction of nitrosocysteine sites using position and composition variant features. Letters in Organic Chemistry 16 (2019) 283-293.
[62] X. Cheng, X. Xiao, pLoc_bal-mPlant: predict subcellular localization of plant proteins by general PseAAC and balancing training dataset Curr Pharm Des 24 (2018) 4013-4022.
[63] J.X. Li, S.Q. Wang, Q.S. Du, H. Wei, X.M. Li, J.Z. Meng, Q.Y. Wang, N.Z. Xie, R.B. Huang, Simulated protein thermal detection (SPTD) for enzyme thermostability study and an application example for pullulanase from Bacillus deramificans. Curr Pharm Des 24 (2018) 4023-4033.
[64] A.W. Ghauri, Y.D. Khan, N. Rasool, S.A. Khan, pNitro-Tyr-PseAAC: Predict nitrotyrosine sites in proteins by incorporating five features into Chou's general PseAAC. Curr Pharm Des 24 (2018) 4034-4043.
[65] K.C. Chou, X. Cheng, X. Xiao, pLoc_bal-mEuk: predict subcellular localization of eukaryotic proteins by general PseAAC and quasi-balancing training dataset. Med Chem 15 (2019) 472-485.
[66] X. Xiao, X. Cheng, G. Chen, Q. Mao, pLoc_bal-mGpos: predict subcellular localization of Gram-positive bacterial proteins by quasi-balancing training dataset and PseAAC. Genomics doi:10.1016/j.ygeno.2018.05.017 (2018).
[67] X. Xiao, X. Cheng, G. Chen, Q. Mao, pLoc_bal-mVirus: predict subcellular localization of multi-label virus proteins by PseAAC and IHTS treatment to balance training dataset. Med Chem 15 (2019) 496-509.
[68] K.C. Chou, Some remarks on protein attribute prediction and pseudo amino acid composition (50th Anniversary Year Review, 5-steps rule). Journal of Theoretical Biology 273 (2011) 236-247.
[69] K.C. Chou, Advance in predicting subcellular localization of multi-label proteins and its implication for developing multi-target drugs. Current Medicinal Chemistry doi: 10.2174/0929867326666190507082559 (2019).
[70] B. Liu, F. Liu, X. Wang, J. Chen, L. Fang, Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences. Nucleic Acids Research 43 (2015) W65-W71.
[71] B. Liu, H. Wu, Pse-in-One 2.0: An improved package of web servers for generating various modes of pseudo components of DNA, RNA, and protein sequences. Natural Science 9 (2017) 67-91.
[72] C.T. Zhang, An optimization approach to predicting protein structural class from amino acid composition. Protein Science 1 (1992) 401-408.
[73] K.C. Chou, D.W. Elrod, Bioinformatical analysis of G-protein-coupled receptors. Journal of Proteome Research 1 (2002) 429-433.
[74] K.C. Chou, Y.D. Cai, Prediction and classification of protein subcellular location: sequence-order effect and pseudo amino acid composition. Journal of Cellular Biochemistry (Addendum, ibid. 2004, 91, 1085) 90 (2003) 1250-1260.
[75] L. Hu, T. Huang, X. Shi, W.C. Lu, Y.D. Cai, Predicting functions of proteins in mouse based on weighted protein-protein interaction network and protein hybrid properties PLoS ONE 6 (2011) e14556.
[76] Y.D. Cai, K.Y. Feng, W.C. Lu, Using LogitBoost classifier to predict protein structural classes. Journal of Theoretical Biology 238 (2006) 172-176.
[77] K.C. Chou, Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics 21 (2005) 10-19.
[78] A. Dehzangi, R. Heffernan, A. Sharma, J. Lyons, K. Paliwal, A. Sattar, Gram-positive and Gram-negative protein subcellular localization by incorporating evolutionary-based descriptors into Chou's general PseAAC. J Theor Biol 364 (2015) 284-294.
[79] M. Behbahani, H. Mohabatkar, M. Nosrati, Analysis and comparison of lignin peroxidases between fungi and bacteria using three different modes of Chou's general pseudo amino acid composition. J Theor Biol 411 (2016) 1-5.
[80] M. Kabir, M. Hayat, iRSpot-GAEnsC: identifing recombination spots via ensemble classifier and extending the concept of Chou's PseAAC to formulate DNA samples. Molecular Genetics and Genomics 291 (2016) 285-96.
[81] P.K. Meher, T.K. Sahu, V. Saini, A.R. Rao, Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou's general PseAAC. Sci Rep 7 (2017) 42362.
[82] Z. Ju, J.J. He, Prediction of lysine propionylation sites using biased SVM and incorporating four different sequence features into Chou's PseAAC. J Mol Graph Model 76 (2017) 356-363.
[83] B. Yu, S. Li, W.Y. Qiu, C. Chen, R.X. Chen, L. Wang, M.H. Wang, Y. Zhang, Accurate prediction of subcellular location of apoptosis proteins combining Chou's PseAAC and PsePSSM based on wavelet denoising. Oncotarget 8 (2017) 107640-107665.
[84] J. Ahmad, M. Hayat, MFSC: Multi-voting based Feature Selection for Classification of Golgi Proteins by Adopting the General form of Chou's PseAAC components. J Theor Biol 463 (2018) 99-109.
[85] S. Akbar, M. Hayat, iMethyl-STTNC: Identification of N(6)-methyladenosine sites by extending the Idea of SAAC into Chou's PseAAC to formulate RNA sequences. J Theor Biol 455 (2018) 205-211.
[86] E. Contreras-Torres, Predicting structural classes of proteins by incorporating their global and local physicochemical and conformational properties into general Chou's PseAAC. J Theor Biol 454 (2018) 139-145.
[87] S. Zhang, Y. Liang, Predicting apoptosis protein subcellular localization by integrating auto-cross correlation and PSSM into Chou's PseAAC. J Theor Biol 457 (2018) 163-169.
[88] J. Ahmad, M. Hayat, MFSC: Multi-voting based feature selection for classification of Golgi proteins by adopting the general form of Chou's PseAAC components. J Theor Biol 463 (2019) 99-109.
[89] M. Tahir, M. Hayat, S.A. Khan, iNuc-ext-PseTNC: an efficient ensemble model for identification of nucleosome positioning by extending the concept of Chou's PseAAC to pseudo-tri-nucleotide composition. Mol Genet Genomics 294 (2019) 199-210.
[90] K.C. Chou, An unprecedented revolution in medicinal chemistry driven by the progress of biological science. Current Topics in Medicinal Chemistry 17 (2017) 2337-2358.
[91] H.B. Shen, PseAAC: a flexible web-server for generating various kinds of protein pseudo amino acid composition. Analytical Biochemistry 373 (2008) 386-388.
[92] P. Du, X. Wang, C. Xu, Y. Gao, PseAAC-Builder: A cross-platform stand-alone program for generating various special Chou's pseudo amino acid compositions. Analytical Biochemistry 425 (2012) 117-119.
[93] D.S. Cao, Q.S. Xu, Y.Z. Liang, propy: a tool to generate various modes of Chou's PseAAC. Bioinformatics 29 (2013) 960-962.
[94] P. Du, S. Gu, Y. Jiao, PseAAC-General: Fast building various modes of general form of Chou's pseudo amino acid composition for large-scale protein datasets. International Journal of Molecular Sciences 15 (2014) 3495-3506.
[95] K.C. Chou, Pseudo amino acid composition and its applications in bioinformatics, proteomics and system biology. Current Proteomics 6 (2009) 262-274.
[96] W. Chen, H. Lin, Pseudo nucleotide composition or PseKNC: an effective formulation for analyzing genomic sequences. Mol BioSyst 11 (2015) 2620-2634.
[97] B. Liu, F. Yang, D.S. Huang, iPromoter-2L: a two-layer predictor for identifying promoters and their types by multi-window-based PseKNC. Bioinformatics 34 (2018) 33-40.
[98] M. Tahir, H. Tayara, K.T. Chong, iRNA-PseKNC(2methyl): Identify RNA 2'-O-methylation sites by convolution neural network and Chou's pseudo components. J Theor Biol 465 (2019) 1-6.
[99] K.C. Chou, H.B. Shen, Recent advances in developing web-servers for predicting protein attributes. Natural Science 1 (2009) 63-92
[100] X. Cheng, X. Xiao, pLoc-mPlant: predict subcellular localization of multi-location plant proteins via incorporating the optimal GO information into general PseAAC. Molecular BioSystems 13 (2017) 1722-1727.
[101] X. Cheng, X. Xiao, pLoc-mVirus: predict subcellular localization of multi-location virus proteins via incorporating the optimal GO information into general PseAAC. Gene (Erratum: ibid., 2018, Vol.644, 156-156) 628 (2017) 315-321.
[102] X. Cheng, X. Xiao, pLoc-mEuk: Predict subcellular localization of multi-label eukaryotic proteins by extracting the key GO information into general PseAAC. Genomics 110 (2018) 50-58.
[103] X. Cheng, X. Xiao, pLoc-mGneg: Predict subcellular localization of Gram-negative bacterial proteins by deep gene ontology learning via general PseAAC. Genomics 110 (2018) 231-239.
[104] X. Cheng, S.G. Zhao, W.Z. Lin, X. Xiao, pLoc-mAnimal: predict subcellular localization of animal proteins with both single and multiple sites. Bioinformatics 33 (2017) 3524-3531.
[105] X. Xiao, X. Cheng, S. Su, Q. Nao, pLoc-mGpos: Incorporate key gene ontology information into general PseAAC for predicting subcellular localization of Gram-positive bacterial proteins. Natural Science 9 (2017) 331-349.
[106] X. Cheng, X. Xiao, pLoc-mHum: predict subcellular localization of multi-location human proteins via general PseAAC to winnow out the crucial GO information. Bioinformatics 34 (2018) 1448-1456.
[107] X. Cheng, S.G. Zhao, X. Xiao, iATC-mISF: a multi-label classifier for predicting the classes of anatomical therapeutic chemicals. Bioinformatics (Corrigendum, ibid., 2017, Vol.33, 2610) 33 (2017) 341-346.
[108] P. Feng, H. Ding, H. Yang, W. Chen, H. Lin, iRNA-PseColl: Identifying the occurrence sites of different RNA modifications by incorporating collective effects of nucleotides into PseKNC. Molecular Therapy - Nucleic Acids 7 (2017) 155-163.
[109] B. Liu, S. Wang, R. Long, iRSpot-EL: identify recombination spots with an ensemble learning approach. Bioinformatics 33 (2017) 35-41.
[110] B. Liu, F. Yang, 2L-piRNA: A two-layer ensemble classifier for identifying piwi-interacting RNAs and their function. Molecular Therapy - Nucleic Acids 7 (2017) 267-277.
[111] W.R. Qiu, S.Y. Jiang, Z.C. Xu, X. Xiao, iRNAm5C-PseDNC: identifying RNA 5-methylcytosine sites by incorporating physical-chemical properties into pseudo dinucleotide composition. Oncotarget 8 (2017) 41178-41188.
[112] W.R. Qiu, B.Q. Sun, X. Xiao, D. Xu, iPhos-PseEvo: Identifying human phosphorylated proteins by incorporating evolutionary information into general PseAAC via grey system theory. Molecular Informatics 36 (2017) UNSP 1600010.
[113] X. Cheng, W.Z. Lin, X. Xiao, pLoc_bal-mAnimal: predict subcellular localization of animal proteins by balancing training dataset and PseAAC. Bioinformatics 35 (2019) 398-406.
[114] X. Cheng, X. Xiao, pLoc_bal-mGneg: predict subcellular localization of Gram-negative bacterial proteins by quasi-balancing training dataset and general PseAAC. Journal of Theoretical Biology 458 (2018) 92-102.
[115] K.C. Chou, X. Cheng, X. Xiao, pLoc_bal-mHum: predict subcellular localization of human proteins by PseAAC and quasi-balancing training dataset Genomics doi:10.1016/j.ygeno.2018.08.007 (2018).