
Why you should use Django for data science

A video version of this tutorial is also available if you prefer that format.

What is Django?

A web framework built with Python

That statement alone has a few parts worth unpacking.

A web framework is a set of tools that automates much of the repetitive work in building web applications. In Django's case, many of its internals focus on common aspects of web development such as authentication and database connections.

Free and open-source

You can browse Django's source code on GitHub. Django is a much-loved project in the open-source community, with just under 55k stars at the time of writing!

It encourages rapid development

Django is designed so that you can build your projects as quickly as possible. You can take an idea and build the project in a matter of days, or even hours if the project is small enough.

It's built by experienced developers

Django was first released in 2005 and has had a team of core developers working on it ever since. Those developers made sure the framework handles the areas where most developers tend to make mistakes.

It's focused on security

Not all open-source projects focus on security, which is one reason large companies have traditionally preferred paid software: it promises a certain level of quality. Fortunately, Django is maintained under the Django Software Foundation and has a strict process for reporting potential security issues to the team. When issues are found, the team is quick to release new versions with a fix.

It's proven to be scalable

Later in this post, we'll take a look at some of the companies using Django. Just as an example, Instagram uses Django. If Django can scale for Instagram, it should scale well enough for your projects too.

Why you should consider Django (in general)

These are general reasons as to why you should consider using Django in your projects.

There's an abundance of learning resources

In the last few years, there have been hundreds of videos, tutorials, and books made that teach Django. There's no shortage of learning resources. This is a huge strength because it's a lot more difficult to learn something when there aren't that many resources.

Django is an opinionated project

This reason is debatable but I believe it's a good reason to use Django. When a project is opinionated it means that throughout the development of the project, the developers had to make decisions that pushed the project in certain directions. After a long period of time, those decisions have transformed the project into something that is constrained by those decisions, normally making it difficult to work outside of those constraints.

But for beginners and for people who are looking to use Django more as a business tool, this is the exact reason why you would choose Django - because the developers made all the decisions for you.

It's built with Python

I've mentioned this already but it's actually such an important reason. Being one of the most popular programming languages, Python makes Django a really viable option for web development.

Particularly if you're already working in the data science ecosystem of Python, it makes a lot of sense to consider Django instead of learning a completely new language and its libraries.

It has a large community

This ties in with the abundance of learning resources. There are so many people writing tutorials, making content and joining the Django conversation. Now is the time to join and gain the benefits of an active and friendly community.

There are thousands of packages available

In Django, you can package functionality as a reusable app and install it in other projects. At this point, there are thousands of third-party Django packages available to install.

There are packages for handling authentication with Google, querying APIs, handling payments with Stripe, and much more!
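
As a rough sketch of what installing a package looks like, a third-party app is usually added with `pip` and then listed in the project's `INSTALLED_APPS` setting. The fragment below uses the app labels from the django-allauth package as an example of Google authentication. The `reports` app is a hypothetical name for your own app, and the package's documentation covers the remaining setup steps (middleware, URLs, provider credentials) that are not shown here.

```python
# settings.py (fragment, not a complete settings file)
INSTALLED_APPS = [
    # Apps that ship with Django
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    # Third-party package, installed first with: pip install django-allauth
    "allauth",
    "allauth.account",
    "allauth.socialaccount",
    "allauth.socialaccount.providers.google",
    # Your own app (hypothetical name used for illustration)
    "reports",
]
```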

Django for data science and machine learning

In my experience, Django has been a really great choice for data science projects.

I've seen a common scenario while freelancing and in conversations with other developers. It typically happens in companies that work with data, whether as a service they provide to clients or for internal tools. The company starts out using Microsoft Excel. Over time, it wants its data to be more flexible and wants to do things that are more demanding than traditional spreadsheets can handle. It then starts looking at custom solutions to improve its processes.

https://assets.justdjango.com/media/podcast_thumbnails/company-transition.png

Naturally, they will transition into the Python ecosystem (not all the time, but quite often if they want complete control). And if you're doing any kind of work with data in Python, you're almost certainly using pandas, NumPy, Jupyter Notebook and related technologies. This already adds a lot of value to the company. But sometimes they will want to make the information more accessible. A good example is a quick high-level analysis that anyone can pull up, instead of sending an email or asking for a CSV file.

They will start looking at bringing their data projects to the web. This is where Django comes in. Because the project is already in the Python data science ecosystem, Django is going to be the first option to consider when building the project as a web application. Here's why:

Again, Django is built with Python

This really is an important reason because it means you don't need to change technologies. A data science project that uses pandas and other Python libraries is really simple to integrate with Django. Compare this with moving to something like React and Express.js, where you'd have to start from scratch.
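
To make the integration point concrete, here is a minimal sketch, assuming pandas is installed, of the kind of helper a Django view could call. The function name `summarize_sales` and the `region` and `amount` fields are assumptions made up for illustration. The helper accepts any iterable of dicts, which is the shape a Django `QuerySet.values()` call produces, and returns a plain dict that can be serialized to JSON.

```python
import pandas as pd


def summarize_sales(rows):
    """Return the total amount per region as a JSON-serializable dict."""
    # `rows` is any iterable of dicts with "region" and "amount" keys,
    # the same shape QuerySet.values("region", "amount") yields.
    df = pd.DataFrame(list(rows), columns=["region", "amount"])
    totals = df.groupby("region")["amount"].sum()
    # Cast NumPy numbers to built-in floats so json.dumps can handle them.
    return {region: float(total) for region, total in totals.items()}


# Sample data standing in for database rows.
sample_rows = [
    {"region": "north", "amount": 10.0},
    {"region": "south", "amount": 5.5},
    {"region": "north", "amount": 2.5},
]
summary = summarize_sales(sample_rows)
print(summary)
```

Inside a Django view, the same helper could be fed from the database and returned with `JsonResponse(summarize_sales(Sale.objects.values("region", "amount")))`, where `Sale` is a hypothetical model.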

Django is batteries included

Django is a framework that is said to be "batteries included" which means that it is a self-sufficient project. It comes out-of-the-box ready to use with everything that is needed. Django includes functionality for authentication, connecting to a database, rendering HTML and so much more.

This is a huge advantage because you can make as few decisions as possible when it comes to building the web application. You can focus entirely on the data science aspects of the project and not have to worry about how the database works or if everything is secure enough.

This allows you to get your project up and running as quickly as possible. Contrast this with another popular Python web framework: Flask. In my experience, developers who start with Flask tend to become overwhelmed by the amount of decision making required of them. Eventually, they end up switching to Django and rebuilding the project in a matter of days, compared to weeks with Flask.

Django lets you solve business problems

At the end of the day, this is what we're aiming to do. Normally companies and freelance clients are not as concerned with the actual code behind the product - they're more focused on how well the product solves the problem.


So when it comes to data science and machine learning projects, Django is a fantastic choice, if not the best, for taking your project to the web.

With Django, you get to work rapidly, focus on exactly what your business needs, and avoid making decisions in areas outside your expertise.

Companies using Django

Sentry

Sentry is error-logging software. In 2018 they handled over 400 billion events! Sentry is a Django project and is also open-source, so its code is publicly available.

Udemy

Udemy is a popular course-selling website that has over 130,000 courses and millions of students. Their website is also built with Django.

Instagram

Everyone knows Instagram. It's really inspiring that such a large company is using Django.

Robinhood

Robinhood is an investment app that uses Django on the backend.

Open edX

Open edX is an educational platform that also uses Django.


These are all big companies so they prove the point that Django is a scalable technology. But Django is also being used in smaller businesses and by individuals. It's a great framework being used by many different kinds of businesses.

Learning resources

The usual places to find free learning resources:

A paid option is https://learn.justdjango.com which provides a structured syllabus for becoming a professional Django developer.
