Grammar is important, but it’s not as simple as memorizing all the rules and applying them. Not long ago, built-in grammar correction simply wasn’t available in apps, websites, or other “easy-to-use” technology. Now, more and more software developers are changing their tune and exposing powerful grammar-correction tools to everyone, so that even machine-generated copy can be edited automatically by other programs. In this tutorial, we’ll do grammar correction using Python.
Gramformer is one such Python package: a library that exposes three separate interfaces to a family of algorithms for detecting, highlighting, and correcting grammar errors. The library takes care of the technical details, so all you need to do is call its methods with your sentences and get back a list of suggestions or a sentence with the errors highlighted. This article will go over how to use Gramformer’s grammar-correction model with Happy Transformer in Python to add an extra layer of polish to your content.
Environment Setup
First, we’ll need to install Gramformer itself. It is not yet available on PyPI, as it is a fairly new package, but we can still install it from its GitHub repo with the following command:
!pip3 install -U git+https://github.com/PrithivirajDamodaran/Gramformer.git
But when I (and many other people) tried to install it this way, it kept pulling in multiple versions of its dependencies. The logs showed messages like “No app image found for TensorFlowJupyter…” and finally “ERROR: ResolutionImpossible.” There’s also an open issue where, for some people (including me), the installation runs endlessly. In addition, the package does not provide a way to modify the text generation settings. To work around these issues, we can use Happy Transformer instead and load the same underlying Gramformer model through a more stable interface.
Install Happy Transformer with:
pip install happytransformer
Happy Transformer allows developers to easily implement state-of-the-art neural NLP models in Python. It’s built on top of the Hugging Face Transformers library, which lets anyone work with complex models in just a few lines of code. With it, you can add text classification, text generation, summarization, and more to your projects. It’s simple to use but powered by cutting-edge NLP technology!
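For a quick taste of what that looks like beyond grammar correction, here is a minimal sketch of the library’s text-generation interface, assuming the happytransformer 2.x API (HappyGeneration and GENSettings); the prompt and settings are just illustrative:

from happytransformer import HappyGeneration, GENSettings

# Load a small GPT-2 checkpoint for free-form text generation
happy_gen = HappyGeneration("GPT2", "gpt2")

# Generate a short continuation of a prompt (values are illustrative)
result = happy_gen.generate_text("Grammar matters because", args=GENSettings(max_length=25))
print(result.text)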
Creating the Application
The model we’ll be using for our application is loaded with HappyTextToText, which performs a text-to-text task. In other words, it takes a piece of text as input and produces a standalone piece of text as output. This is also the only stable Gramformer model as of this writing. First, create a file with the .py extension where we can store our code, and import the following class:
from happytransformer import HappyTextToText
Currently, the available Gramformer model is a T5 model called “prithivida/grammar_error_correcter_v1,” hosted on Hugging Face’s model hub. We can load it by creating a HappyTextToText object.
Write the following line in your file like so:
happy_tt = HappyTextToText("T5", "prithivida/grammar_error_correcter_v1")
This class takes two positional parameters: the first is the model type and the second is the model name.
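For instance, the same class can point at any compatible text-to-text checkpoint on the Hugging Face hub (a hypothetical illustration, not part of this tutorial’s pipeline):

happy_other = HappyTextToText("T5", "t5-small")  # a generic T5 checkpoint, not fine-tuned for grammar correction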
The T5 architecture behind “prithivida/grammar_error_correcter_v1” can perform multiple tasks with a single model (hence “text-to-text”). It accomplishes this by learning the meaning of different prefixes placed before the input. For the Gramformer model we’re using, the only prefix we need is “gec:”.
Now, we’ll define the text containing grammar and spelling errors that we’ll correct using the Gramformer model, with the prefix prepended:
text = "gec: " + "We no open, sorrry"
One interesting feature that Happy Transformer has, and that the Gramformer library currently does not, is the ability to modify the text generation settings. A class called TTSettings controls which generation algorithm is used and with which settings. We can import it the same way we imported HappyTextToText, or separately like this:
from happytransformer import TTSettings
And then using this class we can define various text generation settings. For example:
settings = TTSettings(do_sample=True, top_k=50, temperature=0.7, max_length=20)
With this configuration, do_sample=True makes the model sample its next token instead of always picking the most likely one, and top_k limits that sampling to the 50 most probable tokens (words or sub-word symbols). The more tokens you let it consider, the more creative the text prediction will be. Temperature behaves similarly: as you increase it, the model becomes more likely to select less probable tokens.
You can also customize this a bit further with additional parameters such as min_length and num_beams; the full list is in the Happy Transformer documentation. A short comparison of two contrasting configurations is sketched below.
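To make that trade-off concrete, here is a hedged sketch of one conservative and one more creative configuration; the parameter names follow the TTSettings signature used above, and the exact values are just examples:

from happytransformer import TTSettings

# Conservative: deterministic beam search, tends to stay close to the input
conservative_args = TTSettings(num_beams=5, min_length=1, max_length=100)

# Creative: sampling from a wider pool of tokens with a higher temperature
creative_args = TTSettings(do_sample=True, top_k=100, temperature=1.0, min_length=1, max_length=100)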
Moving forward with the code, we can perform grammar correction by calling the happy_tt object’s generate_text() method. We pass the text as the first positional argument and provide the settings through its “args” parameter.
result = happy_tt.generate_text(text, args=settings)
And finally, print the result.
print(result.text)
And we are done!
Here is the code in its entirety. I have also tweaked the settings a little to get the best possible results.
from happytransformer import HappyTextToText, TTSettings

# Load the Gramformer grammar-correction model (a fine-tuned T5)
happy_tt = HappyTextToText("T5", "prithivida/grammar_error_correcter_v1")

# Prepend the "gec:" task prefix to the sentence we want to fix
text = "gec: " + "We no open, sorrry"

# Sampling settings: a small top_k and lower temperature keep the edits conservative
settings = TTSettings(do_sample=True, top_k=10, temperature=0.5, min_length=1, max_length=100)

result = happy_tt.generate_text(text, args=settings)
print(result.text)
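If you want to reuse the same model for several sentences, a small helper like the sketch below (built on the happy_tt object and settings defined above; the second test sentence is just made up) keeps the “gec:” prefix in one place:

def correct(sentence):
    # Prepend the task prefix and return the corrected text
    result = happy_tt.generate_text("gec: " + sentence, args=settings)
    return result.text

for sentence in ["We no open, sorrry", "She go to school yesterday"]:
    print(correct(sentence))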
Output
Now for the results. You can run this file as you would any other Python file, or just run:
python {file_name}.py #Windows
python3 {file_name}.py #Mac/Linux
It might take a couple of minutes to run if you don’t have a powerful system, but it will print a corrected version of the input sentence.
The input was “We no open, sorrry,” and the corrected output makes it clear that it works. Now, as I mentioned earlier, you can adjust the intensity of the changes the model makes, so you can easily work with more loosely structured sentences and their grammar, not just simple spelling mistakes.
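For example, to push it beyond spelling fixes, you could feed it a more loosely structured sentence and switch to conservative beam-search settings so the model stays close to the original wording. This is a sketch reusing the happy_tt object from above; the sentence is made up, and the exact output will depend on the model:

from happytransformer import TTSettings

# Beam search with no sampling keeps the corrections conservative
strict_settings = TTSettings(num_beams=5, min_length=1, max_length=100)

messy = "gec: " + "them books on the table is mine, I putted them there yesterday"
print(happy_tt.generate_text(messy, args=strict_settings).text)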
Final Words
Gramformer is an open-source library that makes it easy to detect, highlight, and correct grammar errors, and it is fast even on large amounts of text. We have explored the library, seen how to load its correction model through Happy Transformer, and walked through how to use it in your own applications.
It is really simple to use a T5 Transformer model to fix the spelling and grammar of any input text. We hope this blog post has helped you understand how to use a T5 model to correct your text. If you have any questions about this topic, feel free to contact us at any time. Thank you for reading; we are always excited when one of our posts provides useful information on a topic like this!