👋 Howdy, I'm Houjun Liu!

I’m a first-year undergraduate student in the Computer Science Department at Stanford University, advised by Prof. Mykel Kochenderfer. I’m interested in Natural Language Processing and Speech Language Sample Analysis. Specifically, I work on making large models solve important problems by 1) building better tools for language and speech processing to democratize state-of-the-art research, 2) designing improved algorithmic approaches to language model training and decoding to improve performance, and 3) exploring their applications.

Welcome to my academic homepage! This is my little homestead on the internet about my academic interests. Check out my projects below. If you want to know more about the rest of my life, feel free to visit my website!

Recent goings on
  • Feb. 26-27, '24: AAAI 2024! See y'all in Vancouver!
  • Dec. 15, '23: Paper (NACC) accepted by W3PHAI-24
  • Dec. 3, '23: Released TalkBank Utterance Model
  • Jun. 22, '23: Paper (Batchalign) published by JSLHR

Shoot me an email at [firstname] at stanford dot edu, or, if you are around Stanford, grab dinner with me :)


I am a research engineer at the TalkBank Project at CMU under the supervision of Prof. Brian MacWhinney, where I develop better models and tools for clinical language sample analysis. I also work with the Stanford NLP Group, under the direction of Prof. Chris Manning, on using neural models to solve semantic and syntactic tagging tasks efficiently with Stanza. Finally, I am a research assistant with Prof. Xin Liu at UC Davis and at UC Davis Health, where I use transformer models to push our understanding of dementia.

In industry, I lead the development of Condution and simon, and I am a managing partner at #!/Shabang. Previously, I worked as a consulting ML engineer at Dragonfruit AI under the AI Operations team.


UC Davis Health (2023)
A Transformer Approach to Cognitive Impairment Classification and Prediction
Liu, H., Weakley, A.M., Zhang, J., Liu, X.
Talk@NACC, In Press@W3PHAI-24 at AAAI
TalkBank (2023)
Automation of Language Sample Analysis
Liu, H., MacWhinney, B., Fromm, D., Lanzi, A.
TalkBank (2023)
DementiaBank: Theoretical Rationale, Protocol, and Illustrative Analyses
Lanzi, A., Saylor, A.K., Fromm, D., Liu, H., MacWhinney, B., Cohen, M.L.
Nueva (2022)
ConDef: Automated Context-Aware Lexicography Using Large Online Encyclopedias
Liu, H., Sayyah, Z.
Preprint (2021)
Towards Automated Psychotherapy via Language Modeling
Liu, H.


  • 2023- Teaching Assistant, Stanford Association for Computing Machinery (ACM) Chapter
  • 2022-2023 Head TA (co-instructor and summer lecturer) at AIFS AIBridge, a program funded by UC Davis Food Science
  • 2021-2023 Co-Developer, Research@Nueva, a high-school science education program

Course Notes

Some folks have mentioned that my course notes from classes at Stanford and before have been helpful. Feel free to browse my Stanford UG Courses Index if you want to check them out!

© 2019-2024 Houjun Liu. Licensed CC BY-NC-SA 4.0. This website layout is inspired by the lovely homepage of Karel D’Oosterlinck.