– German Paredes, Towson University
While working with the Neighborhood Design Center (NDC) during the past nine weeks, I have learned a lot about data analytics and about what conducting an exploratory analysis involves. The NDC focuses on providing tools, expertise, and partnerships to accomplish community visions, in addition to creating a structure for collaboration within communities. During my first week working with them I learned about the purpose of the project, which is to provide neighborhoods in Baltimore City with a customized document that presents the state of trash and environmental health conditions within their areas. To accomplish this goal, the project focuses on answering the following questions:
- Where are the worst issues with trash in the neighborhood?
- How do issues with trash in the neighborhood compare to other nearby neighborhoods?
- What types of issues with trash are taking place in the neighborhood?
- Are there areas with fewer service requests, fewer citations, or unpaid citations?
- Are issues with trash getting better or worse over time?
Before beginning to work on the project, however, I had to learn to use three tools: GitHub, RStudio, and RMarkdown. These tools can be found at the following links:
- GitHub – https://desktop.github.com/
- RStudio – https://www.rstudio.com/
- RMarkdown – https://rmarkdown.rstudio.com/
GitHub is the site we used to share our R code online, RStudio is the software we used to develop the code, and RMarkdown is an R file format that enabled us to create an HTML output of our code. After getting a better understanding of these tools during my first week working with the NDC, I began to work on the code during my second week.
Prior to working on the code, I spent some time looking at the two datasets that I was going to be working with during the summer. These datasets can be found on Open Baltimore and are accessible to anyone. The first dataset was the Service Requests dataset, which records complaints about trash such as dirty alleys, dirty streets, illegal dumping, and sanitation property issues. The second dataset was the ECB Citations dataset, which measures the city's code enforcement activities such as bulk trash, exterior sanitary maintenance, and trash accumulation. Once I had a better understanding of the information presented by these datasets, I finally began to work on the code.
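As a rough sketch of this first step, the code below filters a service-request table down to the four trash-related request types named above. The column name `SRType`, the type labels, and the sample rows are all assumptions for illustration; in the real project the table would come from reading the Open Baltimore export with `read.csv()`.

```r
# A tiny sample stands in for the Open Baltimore Service Requests export.
# Column names and request-type labels are assumptions, not the real schema.
service_requests <- data.frame(
  SRType = c("SW-Dirty Alley", "SW-Dirty Street", "HCD-Illegal Dumping",
             "WW-Water Leak", "SW-Dirty Alley"),
  Neighborhood = c("Ellwood Park/Monument", "McElderry Park",
                   "Ellwood Park/Monument", "Orangeville", "McElderry Park"),
  stringsAsFactors = FALSE
)

# Keep only the trash-related request types.
trash_types <- c("SW-Dirty Alley", "SW-Dirty Street", "HCD-Illegal Dumping",
                 "HCD-Sanitation Property")
trash_requests <- subset(service_requests, SRType %in% trash_types)
nrow(trash_requests)  # 4 of the 5 sample rows are trash-related
```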
The first graph I created while working on the project was a bar graph showing the number of citations per block in the Ellwood Park/Monument neighborhood. A block is a small group of buildings surrounded by roads. The purpose of this graph was to give a better idea of which areas within Ellwood Park have the most trash issues. In addition, since "blocks" are not something that we commonly use when talking about locations, we also developed a map that displays the blocks in the neighborhood.
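A citations-per-block bar graph like the one described above could be sketched with ggplot2 as follows. The `Block` column and the sample block names are assumptions, not the real ECB Citations schema.

```r
# Sketch: count ECB citations per block, then plot the counts as bars.
library(ggplot2)

# Sample citation records; each row is one citation on a block (assumed data).
citations <- data.frame(
  Block = c("500 N Ellwood Ave", "500 N Ellwood Ave", "600 N Ellwood Ave",
            "2700 E Monument St", "2700 E Monument St", "2700 E Monument St"),
  stringsAsFactors = FALSE
)

# Tally citations per block.
per_block <- as.data.frame(table(citations$Block))
names(per_block) <- c("Block", "Citations")

# Horizontal bar graph so long block names stay readable.
ggplot(per_block, aes(x = Block, y = Citations)) +
  geom_col() +
  coord_flip() +
  labs(title = "ECB citations per block, Ellwood Park/Monument",
       x = "Block", y = "Number of citations")
```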
After creating these plots, my next step was to determine the type and location of service requests in Ellwood Park over the past six months.
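Restricting the requests to the past six months amounts to a date filter followed by a tally by type. The column names (`SRType`, `CreatedDate`), the cutoff date, and the sample rows below are assumptions for illustration.

```r
# Sketch: keep only service requests from roughly the past six months,
# then count how many of each type remain.
requests <- data.frame(
  SRType = c("SW-Dirty Alley", "SW-Dirty Street", "SW-Dirty Alley",
             "HCD-Illegal Dumping"),
  CreatedDate = as.Date(c("2019-07-01", "2019-03-15", "2018-11-20",
                          "2019-06-10")),
  stringsAsFactors = FALSE
)

# Roughly six months (183 days) before an assumed "today".
cutoff <- as.Date("2019-08-01") - 183
recent <- subset(requests, CreatedDate >= cutoff)

# Requests per type within the window.
table(recent$SRType)
```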
Once we had gotten a better idea of the state of service requests and citations in Ellwood Park, we moved to the next step of the project, which was to compare our “main neighborhood” with its nearby, or surrounding, neighborhoods. To accomplish this goal, we followed the same order of analysis that we did for Ellwood Park. First, we explored the similarities of service requests between Ellwood Park and its nearby neighborhoods, and then we explored the similarities between their respective ECB citations.
The following maps show the service requests and citation status of Ellwood Park (shown by the dashed line), along with a 50-meter buffer around the neighborhood that captures service requests and citations for the following neighborhoods:
- Baltimore Highlands, Madison-Eastend, McElderry Park, Orangeville, Orangeville Industrial Area, and Patterson Park Neighborhood.
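A 50-meter buffer like the one above can be built with the sf package. In this sketch a simple square polygon stands in for the real Ellwood Park boundary, and the projected CRS (UTM zone 18N, which covers Baltimore) is an assumption; it matters only because buffer distances must be in meters.

```r
# Sketch: buffer a neighborhood boundary by 50 m, then test which points
# fall inside the buffered area.
library(sf)

# A 100 m x 100 m square standing in for the neighborhood boundary.
neighborhood <- st_sfc(st_polygon(list(rbind(
  c(0, 0), c(100, 0), c(100, 100), c(0, 100), c(0, 0)
))), crs = 32618)  # UTM zone 18N; units are meters

buffer_50m <- st_buffer(neighborhood, dist = 50)

# Two sample points: 20 m and 100 m outside the boundary.
points <- st_sfc(st_point(c(120, 50)),
                 st_point(c(200, 50)),
                 crs = 32618)

# Logical vector: which points intersect the buffered neighborhood?
inside <- st_intersects(points, buffer_50m, sparse = FALSE)[, 1]
inside  # only the first point is within the 50 m buffer
```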
Once we finished developing these prototype plots and had a much better understanding of the information we were working with and the results we could obtain from the datasets, we decided to create a prototype of the HTML output that will be presented to the neighborhoods. We are still working on that prototype (as of the ninth week), and it includes newer versions of the plots shown above, plus a few new ones.
Once our HTML prototype is completed and our code works with all neighborhoods, our next step is to gather feedback from different communities so that we can make changes and improve our plots, maps, and HTML output.
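One way to make the same report work for every neighborhood is RMarkdown's parameterized reports: the `.Rmd` file declares a `neighborhood` parameter in its YAML header and refers to it as `params$neighborhood`, and a loop renders one HTML file per neighborhood. The file name `report.Rmd` and the neighborhood list below are assumptions, not the project's actual files.

```r
# Sketch: render one HTML report per neighborhood from a single
# parameterized RMarkdown file.
neighborhoods <- c("Ellwood Park/Monument", "McElderry Park",
                   "Madison-Eastend")

# Turn each name into a safe output file name.
out_files <- paste0(gsub("[^A-Za-z]+", "_", neighborhoods), ".html")

# The render loop is commented out because it needs a real report.Rmd:
# library(rmarkdown)
# for (i in seq_along(neighborhoods)) {
#   render("report.Rmd",
#          params = list(neighborhood = neighborhoods[i]),
#          output_file = out_files[i])
# }
out_files
```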
As for what I have learned during the past nine weeks, there is certainly a lot. I have a much better understanding of the importance of data science in "the real world," far more experience working with R than I had at the beginning of the summer, and a much greater awareness of the trash and environmental health conditions that neighborhoods in Baltimore are dealing with. I have also learned to create maps with R, among many other things.