Gensim relies on your donations for sustenance. If you like Gensim, please consider donating.

Why Gensim?

Super fast


The fastest library for training vector embeddings, in Python or otherwise. The core algorithms in Gensim use battle-hardened, highly optimized and parallelized C routines.

Data Streaming


Gensim can process arbitrarily large corpora, using data-streamed algorithms. There are no "dataset must fit in RAM" limitations.
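To illustrate what data streaming means in practice, here is a minimal sketch of a corpus that is read one line at a time instead of being loaded into memory. The class name StreamedCorpus, the helper demo() and the whitespace tokenization are assumptions made for this example and are not part of Gensim; the sketch only shows the general pattern of a re-iterable object that yields one tokenized document per step.

```python
import os
import tempfile


class StreamedCorpus:
    """Hypothetical helper: yields one tokenized line at a time from a text file."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Re-open the file on every pass so the corpus can be iterated
        # more than once (training usually needs several passes).
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                tokens = line.lower().split()  # naive whitespace tokenizer (assumption)
                if tokens:
                    yield tokens


def demo():
    """Write a tiny two-line corpus to a temporary file and stream it back."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write("Human machine interface\n\nGraph of trees\n")
        path = tmp.name
    try:
        corpus = StreamedCorpus(path)
        first_pass = list(corpus)
        second_pass = list(corpus)  # a second pass yields the same documents
        assert first_pass == second_pass
        return first_pass
    finally:
        os.remove(path)


if __name__ == "__main__":
    for document in demo():
        print(document)
```

Only one line is held in memory at any moment, so the file size is bounded by disk rather than RAM. An iterable of this shape could, for example, be passed as the sentences argument of gensim.models.Word2Vec.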

Platform independent


Gensim runs on Linux, Windows and macOS, as well as any other platform that supports Python and NumPy.

Proven


With thousands of companies using Gensim every day, over 2600 academic citations and 1M downloads per week, Gensim is one of the most mature ML libraries.

Open source


All Gensim source code is hosted on GitHub under the GNU LGPL license and is maintained by its open source community. For commercial arrangements, see Business Support.

Ready-to-use models and corpora


The Gensim community also publishes pretrained models for specific domains like legal or health, via the Gensim-data project.

Installation


Quick install

Run in your terminal (recommended):

pip install --upgrade gensim

or, alternatively for conda environments:

conda install -c conda-forge gensim

That's it! Congratulations, you can proceed to the tutorials.

Code dependencies

Gensim runs on Linux, Windows and macOS, and should run on any other platform that supports Python 3.8+ and NumPy. Gensim depends on the following software:

  • Python, tested with versions 3.8, 3.9, 3.10 and 3.11.
  • NumPy for number crunching.
  • smart_open for transparently opening files on remote storage and compressed files.
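The requirements above can be checked before installing with a short script. This is only a sketch: the function name check_environment and the constants are made up for this example, and the check is limited to the Python version and to whether the numpy and smart_open modules can be found.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)                         # minimum version stated above
REQUIRED_MODULES = ("numpy", "smart_open")  # import names of the dependencies


def check_environment(min_python=MIN_PYTHON, modules=REQUIRED_MODULES):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing module: {name}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    if issues:
        print("\n".join(issues))
    else:
        print("Environment looks ready for Gensim.")
```

pip normally installs missing dependencies automatically, so a check like this is mainly useful when diagnosing a broken or restricted environment.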

Testing Gensim

Gensim uses continuous integration, automatically running a full test suite on each pull request:

CI service       Task                                                 Build status
GitHub Actions   Run tests on Linux and Mac, plus check code style    GitHub Actions
AppVeyor         Run tests on Windows                                 AppVeyor
CircleCI         Build documentation                                  CircleCI

Or, to install and test Gensim locally:


pip install -e .  # compile and install Gensim from the current directory
pytest gensim     # run the tests

Who is using Gensim?

Doing something interesting with Gensim? Sponsor Gensim and ask to be featured among adopters.

  • “Here at Tailwind, we use Gensim to help our customers post interesting and relevant content to Pinterest. No fuss, no muss. Just fast, scalable language processing.”

    Waylon Flinn
    Tailwind