Conf42 Python 2023 - Online

Using Python to Build Applications for Language Learning


Abstract

We demonstrate how Python developers can use APIs to build applications and tools for language learning. In particular, we introduce the Word of the Hour platform and the Word of the Hour Python API. Finally, we hope to inspire new collaborations between language learners and Python developers.

Michael’s GitHub repo

Summary

  • Every hour a vocabulary word is posted, along with definitions and translations into over a dozen languages. The tool is supported on many different platforms. There are three core areas where we use Python. The first is word selection, the next is crowdsourcing and the third is social media posts.
  • We crowdsource translations for 40 different languages and have received over 35,000 crowdsourced translation submissions from users. Many of these submissions have since been edited and modified, and they continue to be updated.
  • We post to about 30 different social media pages every hour across various social media platforms, in most cases through an API that lets us interact with the platform. Right now, we're not actively using many Python-based APIs to post to social media, but we have used some in the past.
  • The Wath API enables Python developers to include the current word of the hour, along with its definitions and translations, within their apps. You could incorporate this into your web apps or any other Python-based application. It can be a great supplement that can help learners.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
My name is Michael Wehar, and I'm going to tell you about using Python to build applications for language learning. So let's jump right in. I developed this multiplatform tool called Word of the Hour, or Wath for short, and it helps you to learn words in multiple languages. Every hour a vocabulary word is posted, along with English definitions and translations into over a dozen languages. Language learning is tough, so our main goal with this multiplatform tool is to provide some simple content that you can digest at regular time intervals and that will help support and motivate you as a language learner. As I said, it's supported on many different platforms: we currently support web, Android, iOS, Slack, Roku, Fire TV, Electron, and many more. You can see some screenshots of what Wath looks like on these different platforms. Roku, where there's an active screen saver, is one of our most popular platforms.
So how do we use Python? Python isn't the only language we use, but it is a very important language to us. There are three core areas where we use Python. The first is word selection: we need to select which words to post. We analyze a data set of over 200,000 words, and we run various statistical analyses to select which words should be featured by Wath. The next area where Python is really important to us is crowdsourcing. We have a whole system set up where users can enter crowdsourcing data into Google Sheets, and we scrape those sheets and combine all the data together, which helps us provide better content in the future. We've crowdsourced over 35,000 translations. The third area is social media posts. Posting this language content regularly to relevant social media platforms is an important part of this tool. So those are the three areas where Python is really important to us.
And I want to go into a little bit more detail about that. With word selection, we start with over 200,000 words, and we have to generate relevant features about those words. Then we have to do ranking and filtering, and we have various processes for doing that. Some of the features we generate are based on the frequency dependency between words, and also on the context in which a word appears in different situations. In order to do this word selection in Python, we have to do some file I/O: we read in data files that contain a lot of text and language data. We build up dictionaries that map words to information about them, and then we do various sorting procedures to rank or filter the words. We also use regex, which allows us to parse or detect certain patterns within the data associated with the words. In Python, all of these things are readily available.
For crowdsourcing, again, we need to do some file I/O. But let's talk a little bit about the crowdsourcing first. We do crowdsourcing for 40 different languages. As I said before, we've received over 35,000 crowdsourced translation submissions from users. Many of these 35,000 submissions have been edited and modified, and they keep being updated, so if you include all those updates and edits, it's many more than 35,000. There are two languages where we've had a really enthusiastic group of users supporting the crowdsourcing: Portuguese and Cornish. We've had many other languages with really enthusiastic users as well, but they didn't quite submit as much as was submitted for Portuguese and Cornish. So, again, we use file I/O to read in all kinds of past data submissions that act as a sort of basis for some of our translations. Then we make requests to our Google Sheets and various other sources where the crowdsourcing data has been submitted.
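The word selection steps described above (reading text data, building a dictionary that maps words to information about them, using regex to detect patterns, then sorting to rank and filter) can be sketched roughly as follows. This is only an illustration, not the actual Wath pipeline: the sample text, the feature names, the length threshold, and the helper functions `build_word_info` and `select_words` are all hypothetical, and an in-memory string stands in for the data files that would normally be read with file I/O.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for data files read via file I/O.
SAMPLE_TEXT = """
The grant was approved. A grant can fund research.
Researchers apply for a grant every year, and the committee may grant it.
zzz 1234 x
"""

# Regex that matches runs of lowercase ASCII letters (digits are skipped).
WORD_PATTERN = re.compile(r"[a-z]+")


def build_word_info(text):
    """Map each word to a dictionary of simple features about it."""
    counts = Counter(WORD_PATTERN.findall(text.lower()))
    return {word: {"frequency": n, "length": len(word)} for word, n in counts.items()}


def select_words(word_info, min_length=4, top_n=3):
    """Filter out short words, then rank by frequency (ties broken alphabetically)."""
    candidates = [w for w, info in word_info.items() if info["length"] >= min_length]
    candidates.sort(key=lambda w: (-word_info[w]["frequency"], w))
    return candidates[:top_n]


word_info = build_word_info(SAMPLE_TEXT)
selected = select_words(word_info)
print(selected)  # ['grant', 'apply', 'approved']
```

A real pipeline would presumably use richer features than raw frequency and word length, but the overall shape (features in a dictionary, then filter and sort) is the same.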
And then we do some filtering to verify that the crowdsourced data meets some basic quality standards. Then we use Git to record what changes have been made, so that we have a sort of checkpoint we can come back to in order to see how the data changed at that point in time.
All right, so for the social media posts, we post to about 30 different social media pages every hour across various social media platforms. For just about all of these platforms, we use some kind of API that allows us to interact with the social media platform. We also use the Wath API, which I'll talk about in a bit, or we use some direct endpoints associated with Wath to get the current word and the data associated with it. Right now, we're not actively using many Python-based APIs to post to social media, but we have used some in the past. We did have one platform where we'd post images using Python on an hourly or every-few-hours basis. And on Discord, we use a Python bot to post content hourly. Some of our other postings to social media don't happen in Python, but they could.
So how would we do this if there was a new social media page you wanted to post content to? Well, first, in the past we've used input arguments to customize how the post will occur or where the post should be made, and in Python it's really easy to use input arguments. We also need to do some simple text operations to format our post, because various social media platforms have restrictions on what kinds of characters and patterns are allowed in your posts. Then we use a social media API, which allows us to actually submit the post to that platform. And, as I'll show you in a bit, we have the Wath API in Python, and we can use that to easily get the current word and its data. So let me tell you about the Wath API.
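The posting steps just described (input arguments to customize a post, then simple text operations to fit a platform's restrictions) might look something like the sketch below. It is an assumption-laden illustration rather than the project's code: the `--platform` and `--max-length` arguments, the 280-character default, the `format_post` helper, and the sample definition are all made up for the example, and the final call to a social media API is left out because it depends on the platform.

```python
import argparse
import re


def format_post(word, definitions, max_length=280):
    """Build a plain-text post, collapsing whitespace and truncating to fit."""
    text = f"Word of the Hour: {word}"
    if definitions:
        text += f" - {definitions[0]}"
    # Collapse runs of whitespace, which some platforms reject or render badly.
    text = re.sub(r"\s+", " ", text).strip()
    # Truncate with an ellipsis if the post exceeds the (assumed) length limit.
    if len(text) > max_length:
        text = text[: max_length - 3].rstrip() + "..."
    return text


def parse_args(argv=None):
    """Read hypothetical input arguments that customize where and how to post."""
    parser = argparse.ArgumentParser(description="Format a Word of the Hour post.")
    parser.add_argument("--platform", default="example")
    parser.add_argument("--max-length", type=int, default=280)
    return parser.parse_args(argv)


# An explicit argument list is passed here so the sketch runs without a shell.
args = parse_args(["--platform", "example", "--max-length", "40"])
# The definition below is made-up sample data, not output from the Wath API.
post = format_post("grant", ["to give or   allow something requested"], args.max_length)
print(f"[{args.platform}] {post}")
```

In a real script, `parse_args()` would be called with no arguments so that the values come from the command line, and the formatted text would then be handed to whichever social media API the target platform provides.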
This is what I'm really excited to present here at this conference. The Wath API enables Python developers to include the current word of the hour, along with its English definitions and translations, within their apps. You can clone our repo and import the Wath API, and then you can simply call fetch. Fetch will return a dictionary object with the keys word, definitions, and translations, so you can really simply get the current data. I'm going to show you a demo, and I hope that you'll follow along and try it out yourself so that maybe you can incorporate Word of the Hour into some of your apps.
So here is our public Git repo on GitHub.com. You can take a look at this, and whenever you're ready you can clone this repo. I cloned the repo and opened up the test.py file. That's the code you see on the right here. Within this test.py file you'll see that I first import the Wath API and then I call fetch. Then I have four tests here. The first test is to get the current word. The second test is to check if there's a translation into German. The third test is to get all of the translations. And the fourth test is to get the definitions. So let's run the code and see what happens.
Okay, you can see that it returned that the current word is grant. Unfortunately, we don't have a German translation for this word, so hopefully crowdsourcing will help us fill in that German translation. But you can see we have translations into many other languages, and we have some definitions below as well. So using the Wath API is as simple as that. You just have to call fetch and then look up the data points that you want. You could incorporate this into your web apps or any type of Python-based application you have, and you'd be able to share the word of the hour.
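The four lookups from the demo can be approximated with the sketch below. Because the transcript does not show the actual module name or the exact shape of the returned data, this example does not import the real package: the local `fetch` function is a hypothetical stand-in that returns hard-coded placeholder data, and the assumption that translations is a dictionary keyed by language name is ours. Only the three top-level keys (word, definitions, translations) and the word "grant" come from the talk.

```python
def fetch():
    """Hypothetical stand-in for the API's fetch(); returns placeholder data."""
    return {
        "word": "grant",
        "definitions": ["placeholder definition 1", "placeholder definition 2"],
        # Assumed shape: language name -> translation. No German entry,
        # mirroring the demo, where no German translation was available.
        "translations": {"Spanish": "placeholder-es", "French": "placeholder-fr"},
    }


data = fetch()

# Test 1: get the current word.
current_word = data["word"]

# Test 2: check whether there is a translation into German.
has_german = "German" in data["translations"]

# Test 3: get all of the translations.
all_translations = data["translations"]

# Test 4: get the definitions.
definitions = data["definitions"]

print("Current word:", current_word)
print("German translation available:", has_german)
for language, translation in all_translations.items():
    print(f"  {language}: {translation}")
for definition in definitions:
    print("  -", definition)
```

With the real API, the stand-in function would be replaced by the import and `fetch` call shown in the repo's test.py, and the lookups would be adjusted to whatever structure the returned dictionary actually has.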
Whether you're sharing it just to show what the word of the hour is, or you have some other kind of language tool, it can be a great supplement that can help learners. So I encourage you to try this out for yourself, and I really appreciate you taking the time to listen to this talk. Thank you so much, and I hope you have a great Conf42 conference. Okay, bye.

Michael Wehar

Professor @ Swarthmore College



