Dive Into The Heart of Chicago's Crime



The iconic city of Chicago, Illinois, is widely known for its crime. It was nicknamed 'Chiraq' because its crime rate was compared to the war in Iraq. Between Al Capone and his 'Chicago Outfit' gang during Prohibition, Fred Hampton's assassination by the city's police, and the sudden rise of crime in 2015, Chicago's crime is an interesting subject to study.

Through this study of 20 years of crime, we want to bring insights into the evolution of crime, its geographical distribution, and its effects on the city. We take an unbiased and apolitical approach to give people a fair idea of the city's safety.

Our analysis is divided into three parts. We first look at crime inside the city of Chicago.
Then, based on this first analysis, we look for relationships between crime and aspects of quality of life, such as health services and restaurant food safety.
Finally, we provide two predictive models for crime in Chicago that could be used by the police, citizens, and others. These models aim to estimate the probability of an arrest, or the most likely type of crime, given a location and a time.

Crime Analysis


The Chicago Police Department has reported more than 7 million crimes since 2001. These crimes fall into multiple categories (e.g. homicide, prostitution).

Word cloud of crime categories in Chicago

The bigger the text, the more frequent this type of crime is in Chicago.
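A word cloud like this one is typically driven by a simple frequency count per category. The sketch below is a minimal illustration on made-up records, not the code behind the figure. The `Primary Type` field name is an assumption about how the dataset labels crime categories.

```python
from collections import Counter

# Toy records (made up). "Primary Type" is an assumed field name.
crimes = [
    {"Primary Type": "THEFT"},
    {"Primary Type": "BATTERY"},
    {"Primary Type": "THEFT"},
    {"Primary Type": "NARCOTICS"},
    {"Primary Type": "THEFT"},
    {"Primary Type": "BATTERY"},
]


def category_weights(records):
    """Return each category's share of all records, most frequent first."""
    counts = Counter(record["Primary Type"] for record in records)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.most_common()}


weights = category_weights(crimes)
for category, share in weights.items():
    # A word-cloud renderer would scale the font size with this share.
    print(f"{category}: {share:.2f}")
```

The shares sum to 1, so they can be mapped directly to relative text sizes.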

The word cloud shows which kinds of crime are the most frequent. Here is an explanation of some that might not be familiar to everyone:

  • Battery: "A person may be charged with the crime of battery under Illinois law if he or she makes actual physical contact with another individual with the intent to injure, provoke, or insult that person."
  • Criminal Damage: "You can be charged with criminal property damage if you: damage someone else's property knowingly, damage someone else's property recklessly by fire or explosion, knowingly set fire to someone else's land, knowingly injure someone else's pet, cause damage to property in order to collect insurance."
  • Criminal Trespass: "Trespassing on a residence, real property, or some else's vehicle is a crime in this state. You can be charged with criminal trespass if you enter a building illegally, go on someone's land after notice has been given to stay away, or stay on a property after you have been asked to leave"
  • Deceptive Practice: "If a person makes a false statement to promote the sale of property or services, pays for something with a check that the purchaser knows will not go through, makes a false statement to obtain a bank account, or possesses stolen checks, among many others infractions, that person may find himself facing charges of deceptive practices."

More information can be found in the sources [1] [2].

In order to reduce violence, the Chicago Police Department adopted new crime-fighting techniques in 2004, in cooperation with the LAPD (Los Angeles) and the NYPD (New York City). Let's see how crime has evolved since the police started collecting crime data in 2001.

Indeed, crime started decreasing after 2004. However, did the new techniques employed by the police lead to more arrests?

Even though the police adopted new techniques in 2004, we see that the proportion of arrests did not increase. One hypothesis is that the new techniques mostly aim at preventing crime: they reduced the number of crimes rather than increasing the share of cases closed with an arrest.
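The proportion of arrests is the number of reports that ended in an arrest divided by the total number of reports for a given year. The sketch below shows that computation on made-up records, chosen so that the number of crimes halves while the proportion stays the same. `Year` and `Arrest` are assumed field names, and the figures are not taken from the dataset.

```python
from collections import defaultdict

# Toy records (made up): 8 reports in 2003, 4 reports in 2005.
reports = (
    [{"Year": 2003, "Arrest": True}] * 2
    + [{"Year": 2003, "Arrest": False}] * 6
    + [{"Year": 2005, "Arrest": True}] * 1
    + [{"Year": 2005, "Arrest": False}] * 3
)


def arrest_rate_by_year(records):
    """Return {year: share of reports that ended in an arrest}."""
    totals = defaultdict(int)
    arrests = defaultdict(int)
    for record in records:
        totals[record["Year"]] += 1
        arrests[record["Year"]] += int(record["Arrest"])
    return {year: arrests[year] / totals[year] for year in sorted(totals)}


rates = arrest_rate_by_year(reports)
for year, rate in rates.items():
    print(year, f"{rate:.0%}")
```

In this toy case, crime drops from 8 to 4 reports while the arrest proportion is unchanged, which is the pattern described above.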

Looking at the whole dataset at once is sometimes not representative of its sub-categories. To see whether homicides followed the same trend, we now show how the number of homicides evolved.

This is very interesting: we can see a sharp drop in 2004 that might be related to the new crime-fighting techniques used by the Chicago police. However, the trend differs from the one we saw when considering all kinds of crime together. In particular, we can clearly see the rise of homicides in 2016. This sudden rise was widely covered by newspapers in the USA. Chicago alone was apparently responsible for about half of the increase in homicides in the USA in 2016.

Let's now turn to a spatial analysis of crime in Chicago. Relating specific areas to factors such as the number and types of crimes helps us better understand where violence is happening. We split the city into its 77 community areas. Our first approach is to compare crime in the North and in the South of Chicago.

Map showing north and south areas.

We cannot tell whether the South is more violent than the North or vice versa. We need to split the city into smaller areas to obtain better insights. The following map displays the number of crimes reported per community area; hovering over an area shows its top 3 most frequent crime types.

Top 3 crime type and crime count per community area.
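The two values shown on this map, a total count and a top 3 per community area, can be obtained with a grouped count. The following sketch uses made-up records. `Community Area` and `Primary Type` are assumed field names, and the area ids are only example keys.

```python
from collections import Counter, defaultdict

# Toy records (made up); the counts are not taken from the dataset.
records = (
    [{"Community Area": 25, "Primary Type": "NARCOTICS"}] * 4
    + [{"Community Area": 25, "Primary Type": "BATTERY"}] * 3
    + [{"Community Area": 25, "Primary Type": "THEFT"}] * 2
    + [{"Community Area": 25, "Primary Type": "ASSAULT"}] * 1
    + [{"Community Area": 32, "Primary Type": "THEFT"}] * 3
    + [{"Community Area": 32, "Primary Type": "DECEPTIVE PRACTICE"}] * 2
)


def summarize_areas(rows, n=3):
    """Return {area: {"total": crime count, "top": n most frequent types}}."""
    per_area = defaultdict(Counter)
    for row in rows:
        per_area[row["Community Area"]][row["Primary Type"]] += 1
    return {
        area: {
            "total": sum(counts.values()),
            "top": [crime for crime, _ in counts.most_common(n)],
        }
        for area, counts in per_area.items()
    }


summary = summarize_areas(records)
for area, info in summary.items():
    print(area, info["total"], info["top"])
```

An area with fewer than three distinct crime types simply gets a shorter list.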

We can draw interesting insights from this map:

  • The zone with the highest number of crimes has NARCOTICS as its top crime type. Areas where drug dealing takes place are usually linked with violence. Overall, the zones with NARCOTICS among their top 3 crime types have more crimes than the others.
  • Downtown Chicago is also interesting: DECEPTIVE PRACTICE is one of its top crime types. Most of Chicago's business activity takes place downtown, so there might be a relation between a high number of businesses and the number of deceptive practices, which, as explained above, are usually linked to fraud.
  • In the 'Airport area' we can see OTHER OFFENSE. This might be linked to the presence of the airport, as airports usually come with special legislation.
  • Areas at the edge of Chicago usually have THEFT as their top crime type. We can thus speculate that those areas are mostly residential.

Let's have a look at what kinds of crime these OTHER OFFENSE cases in the airport area correspond to.

You can observe that the most common one is OTHER WEAPONS VIOLATION. As it is strictly forbidden to have a gun in an airport in Illinois, this might lead to a lot of such cases, whereas there are generally no such restrictions in other areas. [3]

We should compare the crime map with the homicide one, as we realized earlier that homicides follow a different trend than crime in general. Below is a heatmap of the homicides in Chicago since 2001.

Heatmap of homicide in the city

This map confirms our previous analysis. The areas with a high density of homicides are usually community areas with narcotics among their top crime types. This suggests that violence in Chicago is related to drug dealing and, more generally, to criminal gang activity.
Moreover, areas without many homicide cases are usually at the border of the city. Those neighborhoods are probably residential.
We can also suppose that some central areas without a high number of crimes are well protected. Indeed, there are few homicide cases near Downtown or the University, and such important areas are usually well protected. As an example, the University of Chicago has its own Police Department.

We will now see whether there is a relationship between violence and demographic evolution.
To do so, we took the population of each community area into account to obtain a crime rate per 100,000 inhabitants. The following map shows the crime rate per community area in 2002, 2010 and 2017. Adding the layers one after another (using the button on the right) can help you understand how violence evolved over the past 20 years. The first layer shown is the crime rate for 2002.

Evolution of the crime rate per 100,000 inhabitants in Chicago between 2002 and 2017
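The rate shown on this map normalizes the number of crimes by the number of residents: rate = crimes × 100,000 / population. The short sketch below applies that formula to two hypothetical areas with the same number of crimes; the area names and all figures are made up for illustration.

```python
def crime_rate_per_100k(crime_count, population):
    """Number of crimes per 100,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return crime_count * 100_000 / population


# Hypothetical areas: same crime count, very different resident populations.
areas = {
    "business district": {"crimes": 1500, "population": 5000},
    "residential area": {"crimes": 1500, "population": 30000},
}

rates = {
    name: crime_rate_per_100k(a["crimes"], a["population"])
    for name, a in areas.items()
}
for name, rate in rates.items():
    print(f"{name}: {rate:.0f} crimes per 100,000 inhabitants")
```

With few residents, the same number of crimes yields a much higher rate, which is worth keeping in mind when reading the rate of an area where few people live.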

Downtown was the most violent zone in 2002 in terms of crime rate. We have to remember that few people live Downtown, as this area is dedicated to businesses, so the crime rate might be high for this reason. However, the crime rate for the downtown area decreased a lot between 2002 and 2010. This might correspond to a major effort by the police to fight violence in the busiest area of the city.
By adding layers successively, we can see that violence is spreading in the West of Chicago, in community areas such as Austin, Humboldt Park, West Garfield Park, East Garfield Park and North Lawndale, and in the South of Chicago, in community areas such as Fuller Park, Englewood and West Englewood.

Now let's see how the population of each area evolved between 2000 and 2010, and between 2010 and 2017. We might see trends such as people leaving violent areas or moving to new ones. The button on the right switches between 2000-2010 and 2010-2017.

Evolution of population per community area between 2000 and 2010, and between 2010 and 2017

As we can see, the most violent area, Austin, had a large population drop between 2000 and 2010. In general, the areas associated with violence in our analysis are areas where the population dropped.
Moreover, we can clearly see that the downtown area gained population between 2000 and 2010. As we noted, the crime rate dropped in that area between 2002 and 2010 even though more and more people were living there. This suggests that the Chicago police made an effort to reduce violence in the downtown area.
However, from 2010 to 2017, many areas saw their population decrease, especially the violent ones and the ones at the border of Chicago. Finally, it seems that the areas around downtown are still attracting people.

Quality of Life Factors


After looking at the most dangerous areas in Chicago in terms of crime rate, we will try to see whether there is a relationship between an area's high crime rate and other factors related to quality of life. We will look at the number of flu clinics and the quality of restaurants.


Below is a map with the density of clinics that gave flu shots in Chicago.

Heatmap of the clinics count per area

Looking at the geographical distribution of the clinics in Chicago, we can see that there are more clinics that gave flu vaccines in the North-West part of the city. There is a clear difference between North and South in terms of the number of clinics. We can then suppose that the Northern part is wealthier than the Southern part. Moreover, the wealthiest areas are probably the ones near the lake, as this is usually where we find the highest number of clinics that gave flu shots. The most violent area, Austin, does not have many clinics even though it is close to downtown. We can suppose that this area is one of the poorest in Chicago on top of being one of the most violent.

Food Inspection

We are now going to look at the food inspection pass rate in Chicago.

Pass inspection rate per community area

We can see that the violent areas usually have a lower inspection pass rate. We can speculate that those areas are the poorest of the city. The areas where restaurants pass inspections most often are near downtown or at the border of the city. Those are probably the richest residential areas of Chicago.
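One way to put a number on a visual link like this is a correlation coefficient between the crime rate and the inspection pass rate of each area. The sketch below computes Pearson's correlation by hand on four hypothetical areas. The values are made up, and this check is an illustration rather than something computed in this study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-area values (made up), in the same area order.
crime_rates = [2000, 4000, 6000, 8000]   # crimes per 100,000 inhabitants
pass_rates = [0.80, 0.75, 0.65, 0.60]    # share of inspections passed

r = pearson(crime_rates, pass_rates)
print(f"correlation: {r:.2f}")
```

A value close to -1 would support the reading that higher crime rates go with lower pass rates, although a correlation alone says nothing about the cause.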

Overall, when looking at quality-of-life factors, we can clearly see that violent areas are the ones suffering from a lower quality of life. We can speculate that those areas are the poorest of Chicago. The richest parts of the city are probably the ones that benefit the most from police action and are therefore less prone to violence.

Predictive Models

ML models

Now that we have extracted meaningful information from our crime dataset and gained a better understanding of the situation, we would like to build tools that could help police and citizens by detecting patterns in Chicago's crime.
To do so, we model our dataset for different purposes using a variety of machine learning methods. The idea is to build two kinds of models:

  • Arrest prediction model: Given the information about a crime that is available in our crime dataset, which roughly corresponds to what the authorities would know about a recent unsolved crime, this model returns the probability that the culprit is arrested. As an example, this model could help the police distribute their forces more efficiently around the city.
  • Crime type prediction model: This model gives a probability for each type of crime, again given the data of the crime dataset, except the Description feature (which is obviously highly correlated with the type of crime) and the Arrest feature. This model could be useful in several ways. For instance, the police might find it useful to know what to expect from a potential crime at a given location. It could also help the police reduce the most frequent crime types by strengthening patrols in certain areas during certain periods.

To present these two models, we let the user select a few parameters and observe the results of our models. The other parameters are fixed: categorical features are set to their most frequent value, and continuous features to their average. The date used is the latest one with sufficient data (2018). As a reminder, the most violent area determined in our analysis is Austin (code 25).
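The way the non-selected parameters are fixed can be sketched as follows: the most frequent value for categorical features, the average for continuous ones, then the user's choices applied on top. The feature names (`Location Description`, `Latitude`, `Community Area`, `Month`) and the records are assumptions for illustration, and `default_inputs` is a hypothetical helper, not the demo's actual code.

```python
from statistics import mean, mode

# Toy records (made up); the feature names are assumptions.
past_crimes = [
    {"Location Description": "STREET", "Latitude": 41.75},
    {"Location Description": "RESIDENCE", "Latitude": 41.875},
    {"Location Description": "STREET", "Latitude": 42.0},
]


def default_inputs(rows, categorical, continuous):
    """Most frequent value for categorical features, mean for continuous ones."""
    defaults = {}
    for name in categorical:
        defaults[name] = mode([row[name] for row in rows])
    for name in continuous:
        defaults[name] = mean([row[name] for row in rows])
    return defaults


defaults = default_inputs(past_crimes, ["Location Description"], ["Latitude"])

# Parameters picked by the user override or extend the defaults.
user_choice = {"Community Area": 25, "Month": 7}
model_input = {**defaults, **user_choice}
print(model_input)
```

Fixing the remaining features this way keeps the demo simple, at the cost of showing predictions for a "typical" crime rather than for every possible combination of inputs.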

Arrest model

As we can observe, people arrested for narcotics usually had to be caught red-handed, because hardly anyone reports drug-related crimes. Thus, our model tends to predict that a narcotics-related crime will result in an arrest. For most thefts, on the other hand, the culprit is not arrested. This isn't a surprise, as most theft reports are made by the victim after the event. Finally, we observe that the month of the year doesn't affect the arrest probability.

Crime type model

For the second model, we can observe that the main crime types in the summer months differ from those of the rest of the year. The police should therefore adapt to this change during that period.
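To make the idea of "probability of each crime type given a location and a time" concrete, here is a deliberately naive frequency baseline: it counts past crime types per (area, month) pair and normalizes the counts. It is a stand-in for illustration only and not the machine learning model presented above. The field names, including a `Month` field derived from the date, are assumptions, and the records are made up.

```python
from collections import Counter, defaultdict

# Toy records (made up); "Month" is assumed to be derived from the crime date.
history = (
    [{"Community Area": 25, "Month": 7, "Primary Type": "BATTERY"}] * 3
    + [{"Community Area": 25, "Month": 7, "Primary Type": "NARCOTICS"}] * 1
    + [{"Community Area": 25, "Month": 1, "Primary Type": "NARCOTICS"}] * 2
    + [{"Community Area": 25, "Month": 1, "Primary Type": "THEFT"}] * 2
)


def fit_type_frequencies(rows):
    """Count crime types for every (community area, month) pair."""
    table = defaultdict(Counter)
    for row in rows:
        key = (row["Community Area"], row["Month"])
        table[key][row["Primary Type"]] += 1
    return table


def predict_type_probabilities(table, area, month):
    """Return {crime type: empirical probability}, or {} if the pair is unseen."""
    counts = table.get((area, month))
    if not counts:
        return {}
    total = sum(counts.values())
    return {crime: n / total for crime, n in counts.most_common()}


table = fit_type_frequencies(history)
summer = predict_type_probabilities(table, area=25, month=7)
winter = predict_type_probabilities(table, area=25, month=1)
print("July:", summer)
print("January:", winter)
```

Comparing the two outputs shows how the dominant crime type can change with the month, which is the kind of seasonal shift described above.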