Liang Huang

Professor of Computer Science, School of EECS
Professor (by courtesy) of Biochemistry & Biophysics
Faculty, Bioengineering Graduate Program
Affiliated Faculty, Center for Quantitative Life Sciences (CQLS)
Oregon State University (official profile)

I am a computational linguist and computational biologist, most fascinated by the shared mathematical foundations between the two.

Research Areas: NLP (simultaneous translation, parsing), compbio (RNA folding, RNA design, homologous folding, protein design), ML (structured prediction, weakly-supervised learning).
Research Focus: efficient algorithms and provable theory in the above areas.

  • 2024/02: We (PI Huang, co-PI Mathews) were one of the 9 teams nationwide to receive major funding for interdisciplinary RNA research. See news release from NSF/NIH.
  • 2023/11: I gave a Distinguished Lecture at UW Paul Allen School of CSE (video).
  • 2023/10: I gave an invited talk at the 11th mRNA Health Conference, co-organized by 2023 Nobel Laureate Katalin Kariko. Dr. Kariko commented on my work in her keynote.
  • 2023/05: Landmark Nature paper (pre-)published (as an accelerated article preview); see also Nature news. I reduced mRNA design to lattice parsing via the Bar-Hillel construction (1961), i.e., "the intersection of a context-free language and a regular language is still context-free".
  • 2023/03: My first PhD student Ashish Vaswani (1st-author of the Transformer paper) was featured in this USC story about ChatGPT.
  • 2022/11: LinearSampling published by Nucleic Acids Research.
  • 2021/11: LinearTurboFold published by PNAS. First PNAS paper from OSU College of Engineering.
  • 2021/07: Delivered ISMB 2021 Integrative RNA COSI Keynote.
  • 2021/06: Delivered CVPR 2021 Invited Talk.
  • 2020/04: two ACL 2020 papers and one ISMB 2020 paper (LinearPartition) accepted.
  • 2019/06: Delivered ACL 2019 Keynote Speech (attended by 2000+ researchers).
  • 2019/05: We have 3 ACL 2019 papers accepted.
  • 2019/05: LinearFold is accepted by ISMB 2019 (top conference in bioinformatics and comp. bio.) and will appear in Bioinformatics in July.
  • 2019/04: I was recognized as one of the most influential NLP researchers in the past ten years (2007-2017) by the CS citation index Aminer from Tsinghua University.
  • 2018/10: groundbreaking work in simultaneous translation -- first practical system that achieves high quality and low latency (done at Baidu Research). Media coverage by IEEE Spectrum, MIT Technology Review, FORTUNE, etc. Also covered by two podcasts: Data Skeptic and Eye on AI.
  • 2018/09: Ph.D. student Mingbo Ma defended (4th PhD graduate) and will become a Research Scientist at Baidu Research (Silicon Valley).
  • 2018/08: 4 papers (3 long, 1 short) in EMNLP 2018.
  • 2018/07: Funded by NSF, NIH, and Intel.
  • 2018/05: SIGMOD 2018 Best Paper Finalist.
  • 2018/03: csrankings.org ranks OSU 15th in NLP (mostly our group), 20th in broad AI, and 36th in CS (in the US).
  • 2018/02: We released the world's fastest RNA secondary structure prediction server, powered by the first linear-time prediction algorithm, based on our earlier work in computational linguistics. It is orders of magnitude faster than existing ones, with comparable or even higher accuracy. Code on Github.
  • 2017/10: Serving as a Senior Area Chair (syntax/parsing) for ACL 2018.
  • 2017/09: I'm giving an invited talk at EMNLP 2017 Workshop on Structured Prediction. [slides]
  • 2017/07: three papers accepted by EMNLP 2017. I had my first first-author paper after a 5-year gap; it feels great getting back to real work!
  • 2017/06: Ph.D. alumnus Ashish Vaswani's recent work on attention received a lot of attention itself, bringing a revolution in deep learning for NLP.
  • 2017/06: Ph.D. student Mingbo Ma participated in the WMT 2017 competition on multimodal translation, and achieved the best TER score among 15 systems on the English+Image->German COCO task (the hardest task in that competition, since it's into German rather than French, and tested on out-of-domain images).
  • 2017/05: Ph.D. student Kai Zhao successfully defended and accepted a rare Research Scientist offer from Google Research (NYC).
  • 2017/05: We were part of the OSU team in the DARPA Explainable AI (XAI) grant ($6.5M total).
  • 2017/01: We are hosting Oregon's first and only University site for North American Computational Linguistics Olympiad (NACLO)!
  • 2016/12: Ph.D. student James Cross successfully defended, and joined Facebook as a Research Scientist.
  • 2016/11: James's EMNLP 2016 paper received an Honorable Mention for Best Paper (and unanimous full-score reviews).
  • 2016/07: Ph.D. alumnus (USC) Ashish Vaswani joined Google Brain as a Research Scientist.
  • 2015/09: Postdoc Feifei Zhai joined IBM T. J. Watson as a Research Staff Member.
  • 2015/09: Our group moved to Oregon State University after three years at the City University of New York.
  • 2015/07: We received a Yahoo! Faculty Engagement Award.
I was known as a computational linguist (parsing, translation, algorithms and theory), but in recent years I have become more interested in adapting my NLP algorithms to computational biology, thanks to the shared mathematical foundations between the two seemingly distant fields. In particular, I have been working on efficient (mostly linear-time) algorithms for RNA folding, mRNA design, homologous folding, and RNA design. When COVID-19 hit, this line of work became much more relevant: SARS-CoV-2 has the longest genome among known RNA viruses (~30,000 nucleotides), which demands linear runtime, and mRNA vaccines are the best way to prevent it, but mRNA design is computationally challenging. As a result, my work has had an impact on the fight against COVID-19 and has led to high-profile papers such as LinearTurboFold (PNAS 2021) and LinearDesign (Nature 2023).

I am most interested in the theoretical and algorithmic aspects of biology and language, and many of my NLP/bio papers draw unexpected connections from theoretical computer science: e.g., my synchronous binarization algorithm (binarizing a synchronous context-free grammar in linear time) was inspired by the Graham scan for convex hulls, my LinearDesign algorithm (optimal mRNA design) uses the intersection of context-free and regular languages (a toy sketch of this intersection appears below), and my k-best parsing algorithms are often featured in my Algorithms courses.
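
To make this concrete, here is a minimal toy sketch in Python (and emphatically not the LinearDesign code): a small weighted context-free grammar in Chomsky normal form is intersected with an acyclic DFA (a lattice) via Bar-Hillel items (q, A, q'), and Viterbi finds the best derivation over any string the DFA accepts. The grammar, weights, and DFA below are invented purely for illustration.

    # Toy Bar-Hillel intersection + Viterbi (illustration only, not LinearDesign).
    # Item (q, A, r) means "nonterminal A derives some string that takes the
    # DFA from state q to state r".

    # Acyclic DFA (a lattice) accepting {ab, aab}
    DFA_START, DFA_FINALS = 0, {3}
    DFA_EDGES = {(0, 'a'): 1, (1, 'a'): 2, (1, 'b'): 3, (2, 'b'): 3}

    # Weighted CFG in Chomsky normal form
    BINARY  = {('S', ('X', 'Y')): 1.0, ('X', ('X', 'X')): 0.5}   # A -> B C
    LEXICAL = {('X', 'a'): 1.0, ('Y', 'b'): 2.0}                 # A -> terminal

    def best_intersected_derivation():
        best = {}  # item (q, A, r) -> best score so far
        # lexical items along DFA edges
        for (q, sym), r in DFA_EDGES.items():
            for (A, s), w in LEXICAL.items():
                if s == sym and w > best.get((q, A, r), float('-inf')):
                    best[(q, A, r)] = w
        # combine adjacent items with binary rules until nothing improves
        # (terminates because the DFA is acyclic)
        changed = True
        while changed:
            changed = False
            for (q, B, r), w1 in list(best.items()):
                for (r2, C, s), w2 in list(best.items()):
                    if r2 != r:
                        continue
                    for (A, (B2, C2)), w in BINARY.items():
                        if (B2, C2) == (B, C) and w1 + w2 + w > best.get((q, A, s), float('-inf')):
                            best[(q, A, s)] = w1 + w2 + w
                            changed = True
        return max(best.get((DFA_START, 'S', f), float('-inf')) for f in DFA_FINALS)

    if __name__ == '__main__':
        print(best_intersected_derivation())   # 5.5: S over "aab", using X -> X X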

On the other hand, my recent work in NLP pioneered the field of simultaneous translation, bringing it from obscurity into the spotlight. See my ACL 2019 Keynote and CVPR 2021 Invited Talk.

Listings of my papers are on Google Scholar and Semantic Scholar, with a subset (top conferences) on csrankings.org.

Recent and Past Talks [playlist]

Selected Recent Work

(for the full list please visit my Google Scholar page)

Older Representative Publications (organized by techniques)

For representative or recent publications please see the "Research" Tab.

For a more complete list you can visit my Google Scholar page (or browse by year). Sometime in the future I will compile a real publication list, but for the time being please use Google Scholar. You can also use Semantic Scholar (by AI2) to see the influences of my work.

Some lesser-known work:

  1. K-best Knuth Algorithm. Tech Report, 2005.

    This brief note develops a k-best extension of the powerful Knuth (1977) algorithm, which itself extends the classical Dijkstra (1959) shortest-path algorithm from graphs to directed hypergraphs (AND-OR graphs) and context-free grammars. My extension was quite straightforward, following a simple k-best Dijkstra extension by Mohri and Riley (2002), combined with the lazy k-best frontier idea from my k-best CKY parsing work (Huang and Chiang, 2005); a toy sketch of the lazy-frontier idea appears after this list. I thought this algorithm was only of theoretical interest, and was surprised (and glad) when Pauls and Klein (2009) further developed it into "k-best A* parsing" by adding admissible heuristics, which won the ACL 2009 Best Paper Award.

    See also [3] below on the relationship between A*, Knuth, and Dijkstra.

    As a side note, the k-best parsing algorithms (Huang and Chiang, 2005) were used in award-winning papers at ACL 2005, NAACL 2006, ACL 2008, and ACL 2009.

  2. Statistical Mechanics of Helix Bundles using a Dynamic Programming Approach, with Adam Lucas, Aravind Joshi, and Ken Dill.
    J. Am. Chem. Soc. (JACS), 129 (14), pp. 4272-4281.

    I used to work on computational biology during my PhD, which resulted in this paper. Although published in a top chemistry journal, it did not get much attention at all, which I believe was due to the difficulty of the technique (dynamic programming) for a chemistry audience (a computational biology audience would have been a much better fit -- they use DP a lot; e.g., Smith-Waterman is essentially edit distance, and even CKY is widely used in comp bio). More DP details, such as the function operators, were left to the Supplement.

  3. Tutorial: Advanced Dynamic Programming in Semiring and Hypergraph Frameworks.
    NAACL 2009 and COLING 2008. [slides] (based on my candidacy exam and Chapter 2 of my thesis)

    I present a unified algebraic theory of dynamic programming based on monotonic weight functions, semirings (distributivity implies monotonicity), and hypergraphs (distributivity is the key to factorization); a toy sketch appears after this list. Here DP includes both Viterbi-style (topological order, requires acyclicity of the graph) and Dijkstra-style (best-first, requires superiority of the semiring); DP also includes both optimization (using idempotent semirings) and summation/expectation (using inside/expectation semirings). This view of DP is much broader than those in the textbooks. The only implementation paradigm not covered in my framework is "memoized top-down recursion", which, as the CLRS book also points out (Sec. 15.1), can be viewed as the DFS version of topological sort (whereas Viterbi, or bottom-up, resembles a BFS version of topological sort). A* search on graphs and hypergraphs (e.g., A* parsing) is also discussed as a generalization of Dijkstra (1959) and Knuth (1977).

    It was the most popular tutorial in both COLING 08 and NAACL 09.

  4. The Art of Algorithms and Programming Contests, with Rujia Liu.
    Tsinghua University Press, 2003. Baidu Baike entry (in Chinese). National Best Seller in CS.

    We wrote this book in college while I was training alongside the SJTU team, which won the ACM/ICPC World Championship for the first time. When it was first published I had just come to the States and was no longer active in the programming contest community, so I had no clue that it would become a best-seller in the field of programming contests (I initially thought it would only sell about a thousand copies). I wrote one of the three chapters of the book, the one on computational geometry, my second favorite topic in algorithms after dynamic programming.

    After a hiatus of many years, I finally resumed contest coaching, at USC (encouraged by David Kempe) and later at CUNY (encouraged by Jerry Waxman), helping the former win its first Southern California Regionals in 2011 (and again in 2012), and helping the latter achieve its best ranking in the Greater New York Regionals in 10 years. I developed a course at CUNY to train for ACM/ICPC (and also for job interviews as a by-product), which became students' favorite class in the CS Dept. I hope to write a new book based on the course notes (with Python code for each problem). I really think ACM/ICPC should allow Python in all Regionals and the World Finals (many Regionals in the US already support Python, including my current region, the Pacific Northwest). I also advocate for more algorithmic problems in the Regionals (esp. in the US), and more geometry problems in the Finals.
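
Below is a toy Python sketch of the lazy k-best frontier idea mentioned in item 1, simplified in two ways to keep it short: it works on an ordinary weighted DAG rather than a hypergraph, and it visits nodes in a fixed topological order (Viterbi-style) rather than best-first (Dijkstra/Knuth-style). The graph and weights are invented for illustration.

    # Lazy k-best path costs in a DAG: for each node, pop the best candidate
    # from a heap of frontier items, and only then expose the next-best
    # derivation from the same incoming edge (the "lazy frontier" idea).
    import heapq

    # DAG: INCOMING[v] = list of (u, weight) for edges u -> v
    INCOMING = {
        's': [],
        'a': [('s', 1.0), ('s', 3.0)],
        'b': [('s', 2.0), ('a', 1.0)],
        't': [('a', 4.0), ('b', 1.0)],
    }
    TOPO = ['s', 'a', 'b', 't']   # predecessors always precede successors
    K = 3

    def k_best_costs():
        kbest = {'s': [0.0]}                       # k best path costs into each node
        for v in TOPO[1:]:
            # frontier: the 1-best candidate from each incoming edge
            cand = []
            for j, (u, w) in enumerate(INCOMING[v]):
                heapq.heappush(cand, (kbest[u][0] + w, j, 0))
            results = []
            while cand and len(results) < K:
                cost, j, i = heapq.heappop(cand)   # current overall best candidate
                results.append(cost)
                u, w = INCOMING[v][j]
                if i + 1 < len(kbest[u]):          # lazily expose the next-best
                    heapq.heappush(cand, (kbest[u][i + 1] + w, j, i + 1))
            kbest[v] = results
        return kbest

    if __name__ == '__main__':
        print(k_best_costs()['t'])   # [3.0, 3.0, 5.0]: the 3 cheapest s-to-t paths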
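
And here is a toy Python sketch of the semiring/hypergraph view from item 3: the same Viterbi-style (bottom-up, topological-order) traversal of an acyclic hypergraph computes either the best derivation (Viterbi semiring: max, times) or the total weight of all derivations (inside semiring: plus, times), depending only on which semiring is plugged in. The hypergraph and weights are invented for illustration.

    # Generic DP over an acyclic hypergraph with a pluggable semiring.
    # Each hyperedge is (head, (tail1, tail2, ...), weight).
    HYPEREDGES = [
        ('B', (), 0.4), ('B', (), 0.6),    # two ways to build B
        ('C', (), 1.0),
        ('A', ('B', 'C'), 0.5),            # A <- B, C
        ('A', ('B',), 0.2),                # A <- B
    ]
    TOPO_ORDER = ['B', 'C', 'A']           # tails always precede heads

    VITERBI = dict(zero=0.0, plus=max, times=lambda a, b: a * b)                 # best derivation
    INSIDE  = dict(zero=0.0, plus=lambda a, b: a + b, times=lambda a, b: a * b)  # sum over derivations

    def run_dp(semiring, goal='A'):
        value = {}
        for node in TOPO_ORDER:                        # Viterbi-style: bottom-up
            total = semiring['zero']
            for head, tails, w in HYPEREDGES:
                if head != node:
                    continue
                prod = w
                for t in tails:                        # distributivity lets us factor over tails
                    prod = semiring['times'](prod, value[t])
                total = semiring['plus'](total, prod)
            value[node] = total
        return value[goal]

    if __name__ == '__main__':
        print(run_dp(VITERBI))   # 0.3: best single derivation of A
        print(run_dp(INSIDE))    # 0.7: total weight of all derivations of A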

Teaching is always rewarding, and is indeed the best part of being a professor.

Current teaching at Oregon State: [lightboard teaching videos]

Past Teaching at CUNY:

Past Teaching at USC:

Past Teaching at Penn:

  • CSE 399: Python Programming (new course, Spring 2006)
    I was chosen by the Department as the first graduate student ever to teach a course.
  • Recitation Instructor, CSE 320, Algorithms (Spring 2005, Prof. Sanjeev Khanna)
  • Recitation Instructor, CSE 262, Automata and Formal Language Theory (Fall 2004, Prof. Jean Gallier).
  • Awarded University Graduate Teaching Prize, 2005. [more details]
Welcome to the Algorithms for Computational Linguistics (ACL) Group! We study efficient algorithms and provable theory in computational linguistics, computational structural biology, and structured machine learning.

I am very fortunate to have been working with the following students:

  • PhD Students:
    • Ning Dai, BS, CS, Fudan. Interned at Bytedance AI Lab. Joined Fall 2021.
    • Tianshuo Zhou, BS, Civil, Southeast; MS, CS, Nanjing U. Worked at Bytedance. Joined Summer 2022.
    • Zetian Wu, BS, Physics, Zhejiang U.; MS, DS, JHU. Joined Fall 2022.
    • Apoorv Malik. Converted from MS to PhD, Winter 2024.
    • Milan Gautam, BS, ECE, Nepal. Joined Winter 2024.
  • MS Students:
  • Undergraduate Students:
    • Wei Yu Tang. Double major in Math and CS with 4.0 GPA. Joining USC PhD program in Fall 2024.
    • Otso Barron. Captain, Oregon State ACM/ICPC teams. President, ACM Student Chapter.

ACL Group Alumni:

  • PhD Graduates (10):
    • (#10, OSU, 2022, NLP) Junkun Chen, BS, CS, Central South; MS, CS, Fudan. Interned at Bytedance AI Lab. Joined Fall 2019. Defended Dec. 2022. Best Demo Paper Award, NAACL 2022. First job: Sr. Applied Scientist, Microsoft Research.
    • (#9, OSU, 2022, compbio) Sizhen Li, BS, EE, MS, CS, BUPT. Interned at Bytedance AI Lab and Baidu Research USA. Joined Fall 2019. Defended Nov. 2022. Published the first PNAS paper in the history of OSU College of Engineering. First job: Research Scientist, Sanofi. (de facto co-advisor: Prof. David Mathews of Rochester).
    • (#8, OSU, 2022, compbio+NLP) Juneki Hong (co-advised by David Hendrix), BS, CS, JHU; MS, CS, CMU. Joined Fall 2016. Defended June 2022. First job: NLP Algorithm Engineer, Bytedance.
    • (#7, OSU, 2022, compbio) Liang Zhang, BS, EE, Nankai; MS, EE, HKUST; MS, Finance, LBS. Joined Spring 2019. Defended May 2022.
    • (#6, OSU, 2021, compbio) He Zhang, BS/MS, Beihang. MS, OSU, 2018. Returned to PhD program, 2020. Defended Sep. 2021. First job: Research Scientist at Baidu Research USA. (de facto co-advisor: Prof. David Mathews of Rochester).
    • (#5, OSU, 2020, NLP) Renjie Zheng, BS/MS, CS, Tongji. Joined ACL Group in Fall 2017 (officially Winter 2018). Interned at Fudan University and Baidu Research. Defended Mar. 2020. First job: Research Scientist at Baidu Research USA.
    • (#4, OSU, 2018, NLP) Mingbo Ma, BS, EE, Jilin. Joined ACL Group in 2013. Interned at IBM TJ Watson, Apple, and Baidu Research. Defended Sep. 2018. First job: Research Scientist at Baidu Research USA. Now Tech Lead Manager, Tiktok.
    • (#3, OSU, 2017, NLP) Kai Zhao, BS, CS, USTC. Joined ACL Group in 2012. Interned at IBM TJ Watson, MSR, and Google Research. 11 ACL/EMNLP/NAACL papers. Defended May 2017. First job: Research Scientist at Google (NYC).
    • (#2, OSU, 2016, NLP) James Cross, BA, French and BS, CS, Auburn; JD, NYU. Member of NY Bar. Interned at IBM TJ Watson and Facebook. Best Paper Honorable Mention at EMNLP 2016. Thesis defended Dec. 2016. First job: Research Scientist at Facebook AI (now FAIR).
    • (#1, USC, 2014, NLP) Ashish Vaswani (co-advised by David Chiang). Interned at IBM TJ Watson and Google Research. Thesis defended May 2014. First job: Research Scientist at USC/ISI. Current job: Senior Research Scientist at Google Brain. First author of the "Attention Is All You Need" paper.
  • Postdocs (2)
    • (#2, CUNY) Feifei Zhai (Ph.D., CAS/IA), postdoc, 2014--2015. First job: research staff member at IBM TJ Watson.
    • (#1, CUNY) Lemao Liu (Ph.D., Harbin), postdoc, 2013--2014. First job: researcher at NICT, Nara, Japan.
  • MS Thesis Graduates (4)
    • Matthew Meyn, BS, Music, Santa Clara; BS, CS, OSU (e-campus). Defended CS MS Thesis, June 2019.
    • He Zhang, BS/MS, EE, Beihang. Defended CS MS thesis, Sep 2018. First job: Research Engineer at Baidu Research USA.
    • Kaibo Liu, BS, Physics, PKU; MS, EE, PKU. Defended CS MS thesis, Dec 2018. First job: Research Engineer at Baidu Research USA.
    • Luyao Zhang, BS, ZJU; PhD, USC. Defended CS MS thesis, Sep 2016. First job: CS Lecturer, Oregon State.

My past summer interns include Sasha Rush (Ph.D., MIT, now Prof. at Harvard), Yoav Goldberg (Ph.D., Ben Gurion, now Prof. at Bar Ilan), Heng Yu (Ph.D., CAS/ICT, now with Samsung), and Zhuoran Yu (M.S., NYU, now with Google NYC's NLP research team).

To prospective students: please use csrankings.org for a data-driven and more up-to-date CS ranking than the purely subjective and totally obsolete US News system. If you look at the past ten years (2007--2017), we are ranked 15th in NLP (mostly our group), 18th in vision, 21st in narrow AI (AAAI/IJCAI), and 20th in broad AI. So the AI group is super strong. Other strong groups include Software Engineering (11th), Crypto (15th), and HCI (23rd). The overall CS ranking is 36th and is improving.

I am from Shanghai, China and speak Wu as my native language.

CV [Dec. 2023]

Education

Ph.D., Computer Science, University of Pennsylvania, 2008. (old homepage) [thesis] [slides]
B.S., Computer Science, Shanghai Jiao Tong University, 2003. summa cum laude. (minor studies in French and English)

Professional Experience

Professor (by courtesy) of Biochemistry/Biophysics, College of Science, Oregon State University, 2024/1--present.
Professor of Computer Science, School of EECS, Oregon State University, 2023/9--present. (Assoc. Prof., 2020--2023; Asst. Prof., 2015--2020)
Distinguished Scientist and Head, Institute of Deep Learning USA, Baidu Research USA, 2018/6--2019/3 and 2021/9--2020/3 and various part-time periods in between (on leave from OSU).

Research Scientist (part-time), IBM Watson Group, 2014/6--2017/1.
Assistant Professor, Queens College and Graduate Center, City University of New York (CUNY), 2012/8--2015/8.
Research Assistant Professor, University of Southern California (USC) and Information Sciences Institute (ISI), 2009/7--2012/8.
Research Scientist, Google Research (Mountain View), 2009/1--7.

Visiting Scholar, Hong Kong Univ. of Science and Technology, 2008/10--2008/11.
Visiting Scholar, Institute of Computing Technologies, Chinese Academy of Sciences, 2007/10--2008/1.
Summer Intern, USC/ISI, 2005/5--10 and 2006/5--10.

Awards / Honors

Best Demo Paper Award, NAACL 2022.
Keynote, iRNA (Integrative RNA Biology) COSI, ISMB 2021
Invited Talk, CVPR 2021
Keynote, ACL 2019 (first ACL keynote by a Chinese NLP researcher)
Best Paper Finalist, SIGMOD 2018.
Best Paper Honorable Mention, EMNLP 2016 (Cross and Huang, 2016).
Best Paper Finalist, ACL 2010 (Huang and Sagae, 2010).
Best Paper Award, ACL 2008 (Huang, 2008). (one of the four single-author best paper awards in ACL history.)
Best Paper Finalist, EMNLP 2008 (Mi and Huang, 2008).
Best Paper Finalist, ACL 2007 (Huang and Chiang, 2007).
Google Faculty Research Award, 2010.
Google Faculty Research Award, 2013.
Yahoo! Faculty Research and Engagement Award, 2015.
Regional Champions (as Faculty Advisor for USC), ACM Int'l Collegiate Programming Contest (ICPC), 2011.
University Prize for Graduate Student Teaching, University of Pennsylvania, 2005. [more details]

Selected Service

Action Editor, Transactions of the ACL (TACL), 2021--
NSF Panelist, 2014, 2015, 2017 x 2, 2019 x 2, 2022
Grant Reviewer, Foreign NSFs: Hong Kong (RGC), 2017; Canadian (NSERC), 2017; Dutch (NWO), 2017; Israeli (ISF), 2015
Area Chair (sentence-level semantics), ACL 2019
Area Chair (syntax and parsing), EMNLP 2018
Senior Area Chair (syntax and parsing), ACL 2018
Area Chair (syntax and parsing), IJCNLP 2017
Area Chair (syntax and parsing), EMNLP 2016
Area Chair (machine translation), ACL 2014
Area Chair (syntax and parsing), ACL 2012
Program Co-Chair, IWPT 2013
Textbook Reviewer, Cambridge Univ. Press, 2016, and Oxford Univ. Press, 2010

Brief bio for talk announcements

Liang Huang is a Professor of Computer Science and (by courtesy) of Biochemistry/Biophysics at Oregon State University. Until recently he was also a Distinguished Scientist at Baidu Research USA. He received his PhD in 2008 from the University of Pennsylvania. He is best known for his work on efficient algorithms and provable theory in computational linguistics and their applications in computational biology, where several of his algorithms have become textbook standards or have appeared in prestigious journals such as Nature (2023) and PNAS (2021). His recognitions include the ACL 2019 plenary keynote, the ISMB 2021 Integrative RNA Biology keynote, a CVPR 2021 invited talk, the ACL 2008 Best Paper Award (as sole author), the EMNLP 2016 Best Paper Honorable Mention, the NAACL 2022 Best Demo Paper Award, several best paper nominations (ACL 2007, EMNLP 2008, ACL 2010, SIGMOD 2018), two Google Faculty Research Awards (2010 and 2013), and a University Teaching Prize at Penn (2005). He also co-authored a best-selling textbook in China on algorithms for programming contests. His work on simultaneous translation and mRNA design has been covered by numerous media reports.

high resolution photo

Linear-Time RNA Folding Toolkits

This whole package of linear-time RNA folding algorithms (GitHub; web server) includes:
  • LinearFold (ISMB 2019) -- MFE/Viterbi folding
  • LinearPartition (ISMB 2020) -- partition function, marginal probabilities, and MEA
  • LinearTurboFold (PNAS 2021) -- homologous folding (alternating align and fold)
  • LinearSampling (NAR 2022) -- stochastic sampling

LinearDesign: optimal mRNA design

TBA

Span-based Parsers

By now, span-based parsing has become the dominant paradigm in neural constituency parsing.

Linear-Time Dynamic Programming Parser (with Max-Violation Perceptron Trainer)

This parser is described in the following two papers:
  1. Liang Huang and Kenji Sagae (2010). Dynamic Programming for Linear-Time Incremental Parsing.
    Proceedings of ACL 2010. Best Paper Nominee.
  2. Liang Huang, Suphan Fayong, and Yang Guo (2012). Structured Perceptron with Inexact Search.
    Proceedings of NAACL 2012.
Parser Features:
  • Extremely fast linear-time parsing (about 0.04 seconds per sentence).
  • Dynamic programming and forest output (encodes exponentially many trees).
  • k-best output (Huang and Chiang, 2005).

Trainer Features:

  • Flexibility in defining new feature templates (my Python code compiles your feature definition language into a dynamically generated Python snippet; a hypothetical sketch appears after this list).
  • Max-Violation Perceptron Training (Huang et al 2012).
  • Parallelized Perceptron Training (McDonald et al 2010).
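
As a purely hypothetical sketch of the template-compilation idea mentioned in the trainer features above (the parser's real template language and state representation are not shown here), one could turn each template string into a generated Python extractor once, so feature extraction avoids re-parsing templates on every state:

    # Hypothetical sketch: compile a feature template such as 's0w|q0w' into a
    # generated Python function state -> feature string.  The template syntax
    # and the state fields (s0w = top-of-stack word, q0w = next queue word)
    # are invented for this example.
    def compile_template(template):
        fields = template.split('|')
        src = "lambda state: '%s=' + '|'.join([%s])" % (
            template, ', '.join('str(state[%r])' % f for f in fields))
        return eval(src)   # generate the extractor once, then reuse it

    if __name__ == '__main__':
        extractors = [compile_template(t) for t in ['s0w', 's0w|q0w']]
        state = {'s0w': 'saw', 'q0w': 'her'}             # toy parser state
        print([f(state) for f in extractors])            # ['s0w=saw', 's0w|q0w=saw|her']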

Download from here.
Download from github repo.


Python Sparse Vector

Written in C as a Python extension module based on collections.defaultdict.
Much faster and slimmer (4 times less memory usage) than David Chiang's svector.
Builtin support for averaged parameters in online learning (e.g. perceptron, MIRA, etc.).

Note: for decoding (e.g. parsing), defaultdict is fast enough (mine is even faster by doing the dot-product in C, which is also possible via Cython), but for learning (e.g. perceptron), defaultdict becomes terrible on big data because Python floats/ints are immutable, which causes too many unnecessary hash operations. Using my hvector can make your learner up to 5 times faster.
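
Here is a pure-Python sketch (not the hvector C code) of the kind of averaged-parameter bookkeeping mentioned above: keep per-feature timestamps and accumulate lazily, so computing the averaged weights never requires summing the full vector at every step. The feature names and updates below are invented.

    from collections import defaultdict

    class AveragedSparseWeights:
        """Sparse weights with lazy averaging via per-feature timestamps."""
        def __init__(self):
            self.w = defaultdict(float)     # current weight per feature
            self.acc = defaultdict(float)   # accumulated weight * (steps held)
            self.last = defaultdict(int)    # step of the last update per feature
            self.step = 0

        def update(self, features, scale):
            """Perceptron-style sparse update: w[f] += scale for each firing feature."""
            self.step += 1
            for f in features:
                # credit the old value for the steps it was held, then update
                self.acc[f] += (self.step - self.last[f]) * self.w[f]
                self.w[f] += scale
                self.last[f] = self.step

        def averaged(self):
            """Average of the weight vector over all steps so far."""
            return {f: (self.acc[f] + (self.step - self.last[f] + 1) * wf) / max(self.step, 1)
                    for f, wf in self.w.items()}

    if __name__ == '__main__':
        weights = AveragedSparseWeights()
        weights.update(['w=saw', 't=VBD'], +1.0)   # step 1: promote gold features
        weights.update(['w=saw'], -1.0)            # step 2: demote a wrong prediction
        print(weights.averaged())                  # {'w=saw': 0.5, 't=VBD': 1.0}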

Download: hvector version 1.0 (Jan 2013).


Forest-Reranking Parser

This parser/reranker is described in the following paper:
  • Liang Huang (2008). Discriminative Parsing with Non-Local Features.
    Proceedings of ACL 2008. (Best Paper Award)
    erratum: Following Charniak, the dev set was section 24, not section 22.
This software has three components:
  1. The forest-dumping version of Charniak parser.
  2. The forest reranker.
  3. The perceptron trainer.
Currently part 1 is downloadable as a standalone package. Parts 2 and 3 are being packaged for release.

Important: If you're using 64-bit Ubuntu, it is recommended that you install Python from source (see Python.org). The default Python 2.7 on those Ubuntu versions (at least 12.04) has an obscure floating-point problem that gives inconsistent results.


Papers that use multiple pieces of my software: many papers from other researchers use a single piece of my software, but there are also some that combine different pieces together.

We gratefully acknowledge the support from external funding agencies.

  • NSF MFB ($1.45M), PI (co-PI: David Mathews of Rochester), 2024-2027.
  • NSF small, sole PI, 2020-2023.
  • NSF small, sole PI, 2018-2021.
  • NIH R01, co-I, 2018-. PI: Robert Tanguay.
  • Intel, PI, 2018-.
  • HP Seedling, PI, 2017.
  • DARPA Explainable AI (XAI), co-PI, 2017--2021. PI: Alan Fern.
  • Yahoo! Faculty Research and Engagement Award, sole PI, 2015--2016.
  • NSF EAGER, sole PI, 2014--2017.
  • Google Faculty Research Award, sole PI, 2013--2014.
  • DARPA DEFT, co-PI, 2012--2016. PI: Andrew Rosenberg.
  • Google Faculty Research Award, PI, 2010--2011.
Internal Funding:
  • OSU EECS Collaboration Fund, 2016--2017
  • PSC-CUNY Enhanced Research Award, 2013--2014
  • USC Viterbi School of Engineering Fund, 2011--2012
Liang Huang
School of EECS, Oregon State University
1148 Kelley Engineering Center (KEC)
Corvallis, OR 97331
541-737-4694 (o)

liang dot huang dot sh at gmail or liang dot huang at oregonstate dot edu.

Simultaneous Translation

                       
Also covered by two podcasts: Data Skeptic and Eye on AI.
   

NLP-inspired RNA folding and design for COVID

("Remarkable AI tool for mRNA design")
("A tool for optimizing mRNA")
("Understanding coronavirus is like reading a sentence")
Disclaimer: I am known to be highly opinionated, and some points below might sound offensive to some readers.

Classical Music

I am a big fan of Classical Music. The composers I admire the most are Johann Sebastian Bach (whose music is so mathematical), Pyotr Ilyich Tchaikovsky (whose melodic talent is second only to Mozart's), and Antonin Dvorak (whose music blends Bohemia with America). I also love, among others (in chronological order), Luigi Boccherini, Wolfgang Mozart, Ludwig van Beethoven, Felix Mendelssohn, Giacomo Puccini, and Sergei Rachmaninoff. Yes, I do have a preference for Baroque, Slavic, and melodic beauty ("lyricism").

On the other hand, I don't have a taste or much respect for Richard Wagner (whom I find rather disgusting) or Franz Liszt. Compared to Frederic Chopin or Niccolo Paganini, Liszt has almost nothing original of his own (like comparing Clementi to Mozart). I don't like Haydn either: he is innovative in form, but his melodic talent is certainly not comparable to those I love. Even "his" most beautiful work, the String Quartet Op. 3 No. 5 that contains the famous 2nd movement Andante Cantabile ("the Serenade"), is now firmly attributed to Hoffstetter (an admirer of Haydn). When I first heard that movement without knowing this, I was like, "There is no way Haydn could have written this!" Some people even called Boccherini "Haydn's wife" (in terms of stylistic similarity); to me that's bullshit -- Boccherini wrote many achingly beautiful themes (such as this and that) that Haydn could never have written. Brahms, on the other hand, does have superb melodic talent, but he is so reluctant to show it.

I think the ideal teaching style in computer science should be something like this "illustrated lecture" below (passionate, and demo on your instrument (the computer) as much as you can):

A Personal History of Languages

I grew up speaking Wu, but in a multilingual environment. Back in the old days, Shanghai was just as multicultural as New York City today, with speakers and communities of all kinds of languages and dialects. When I was growing up, my parents spoke Shanghainese and my grandparents Ningbonese, which I understood but could not speak well; the majority of our neighbors, however, spoke yet another distinctive language called Lower Yangtze Mandarin, which is a "linear interpolation" between Wu and Northern Mandarin, and because of this "interpolation" nature I am still fluent in it today. I started learning Standard Mandarin as a de facto first foreign language in elementary school, but like most of my peers, I ended up with a heavy southern accent. During college I took up French seriously and got admitted into one of the Grandes Ecoles in Paris, but forgot all of it after moving to the US; those memories somehow all came back when I later took up Spanish and Italian. On the other hand, living in the US helped me get rid of my heavy accent in Mandarin, since finally the "training data" around me had more native samples than non-native ones. Living here also exposed me to other Chinese languages and dialects which I never heard back in Shanghai, such as Upper Yangtze Mandarin (aka "Sichuanese"), Cantonese, and Hokkien (aka "Taiwanese"), and more interestingly, various English and Spanish dialects. I still enjoy learning new languages and dialects today.

More on Wu Chinese.

Latin Resources

If you've studied at least one Romance language (French, Italian, Spanish, Portuguese, Romanian, etc.), then Latin shouldn't be too hard. Studying Latin has many benefits, but to me the sound changes in Latin and its descendant languages really helped me understand more about Indo-European. Just the hard-vs-soft "c" alone tells you so much about Indo-European history, the Centum vs. Satem split, etc. Here is a great guide on Latin pronunciation written by a fellow computational linguist, Michael Covington (yes, you probably know him for the "Covington algorithm", an O(n^2) algorithm for dependency parsing).

Enjoy some beautiful music in Latin (Operatic Pop or Catholic):

My interests in Latin were, once again, influenced by David Chiang.

What to do when stuck in a traffic jam...

Well, driving in LA is like hell, so my former colleagues at ISI developed various ways of doing something useful while stuck on the freeways. The NLP group at ISI, being a set of language lovers, is particularly fond of listening to the Pimsleur language programs while driving (someone even got a ticket!), which is arguably the best language learning tool, and I myself did it for Spanish, which worked out great. However, Pimsleur doesn't work that well for Japanese, and works much worse for Chinese, where the writing systems are (partially or completely) non-phonetic, which means you do have to read off the page. You might be able to hold some simple Japanese conversations with Pimsleur, but reading kanas and kanjis requires different training; Chinese is much worse, which makes Pimsleur basically useless even for basic conversations because of homonyms and tones (Jon May used to complain about similar-sounding words such as yiyue "January", yiyuan "hospital", and yinyue "music"; the tones are also different, but most westerners can't perceive that). By the same logic, Pimsleur works best for Italian and Spanish, where the spellings are the most regular (phonemic orthography). Jason Riesa did it for Persian (among the many languages he studied) and claimed it to be really easy, partly because it's a well-behaved Indo-European language (which makes sense).

Besides Pimsleur, I also listened to a great many courses from The Teaching Company while driving; it has a large catalog of courses in all subjects, and they are made in a way that you can get much of the content just by listening (which you can do while cooking, jogging, etc.), although they also have videos if you want to study more carefully. The two professors I recommend the most are Robert Greenberg of UC Berkeley for classical music and music theory, and John McWhorter of Columbia for linguistics.

Both Pimsleur and Teaching Company CDs are extremely expensive, but you can borrow them from local libraries.

Coursera Courses

For English-language courses, I liked this one in particular: Buddhism and Modern Psychology, by Robert Wright of Princeton, and all courses by Curtis (String Quartet, Beethoven Piano Sonatas, and Western Music History).

For Chinese-language courses, I am deeply moved by the two from Lu, Shih-Hao of National Taiwan University: Shiji, and Qin Shi Huang. I wish he could offer more online and I think Taiwanese students are so lucky to be able to sit in his classrooms; compared to those in Taiwan, many history/humanities professors in the mainland are just a joke (except a handful of real scholars, such as Gao, Hua who wrote How did the Red Sun Rise). For those learning algorithms and programming, the one by PKU is recommended; Liu Jiaying is a real fun teacher.

History of Science

As you can tell I am most interested in historical things: historical geography, historical linguistics, and history of science and mathematics. Here is my old collection on aesthetics of mathematics (essays by C.N. Yang, M. Atiyah, T. Gowers, A. Connes, and T. Tao) and a special subcollection on Paul Erdos.

Michael Atiyah on the Art of Mathematics and interview video.

I highly recommend the two documentaries by the Hungarian-American George Csicsery on legendary mathematicians: N is a Number on Paul Erdos, and Counting from Infinity on Yitang Zhang. In the Erdos film, many scenes were shot on the Penn campus (the Math/Physics building DRL in particular) as he visited Prof. Herbert Wilf frequently. In the Yitang film, I love the way he speaks, with a heavy but cute Shanghainese accent, even though he moved to Beijing when he was 13. He is my model as a pure scholar (who does not need recognition or funding) and a dedicated teacher.

Aravind Joshi's Legacy in Computational Biology

Talk Shows

People that I learned the most from

I am extremely grateful to these people, from whom I have learned a great deal:
  • David Chiang
  • Dan Gildea
  • Kevin Knight
  • David Mathews
and to a slightly lesser extent:
  • Ryan McDonald
  • Fernando Pereira
  • Qun Liu
  • Benjamin Pierce
  • Shang-Hua Teng
I certainly learned a whole lot from many, many other people, so I apologize if your name is not listed here. :)

Politicians that I despise the most

  • Winston Churchill
  • Margaret Thatcher
  • Ronald Reagan