Credits & Attributions
Cella.one stands on the shoulders of researchers and open-source maintainers. We cite every third-party dataset and model we ship.
Lexicons and Psycholinguistic Norms
Concreteness Norms — English
Brysbaert, M., Warriner, A. B., & Kuperman, V. (2014). Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods, 46(3), 904–911.
License: Free for research and commercial use with attribution. Source
NRC Emotion Lexicon
Mohammad, S. M., & Turney, P. D. (2013). Crowdsourcing a Word-Emotion Association Lexicon. Computational Intelligence, 29(3), 436–465.
License: Free for research and commercial use with attribution. Source
Word Frequency Norms (SUBTLEX)
Brysbaert, M., & New, B. (2009). Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods, 41(4), 977–990.
License: Free for research and commercial use with attribution. Source
Imageability Norms
Pending source verification. Will be added in a follow-up release.
Machine Learning Models
MiniLM (Sentence Embeddings, Client-Side)
Model: Xenova/all-MiniLM-L6-v2 (ONNX port of sentence-transformers/all-MiniLM-L6-v2).
Wang, W., Wei, F., Dong, L., Bao, H., Yang, N., & Zhou, M. (2020). MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers. NeurIPS 2020.
Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. EMNLP 2019.
License: Apache 2.0. Source
About this page
This page exists because attribution is not optional. Every dataset or model above was released under a license that requires acknowledgment, and we believe credit matters regardless of license strength. If you notice a missing citation, email legal@cella.one.