Data Science Institute 2019 - A Year in Review and a Look Forward to 2020
Dr. Kirk Borne sharing his vision for Data Literacy to MTSU students (9/26/2019)
Now that 2019 is in the past and we are getting ready to roar into the ‘20s, it is a good time to look back at what was accomplished and at what we have to look forward to this year. The Data Science Institute was started in May 2018, and throughout 2018 planning was underway to identify the strategic objectives of the institute and what it can do to promote student development, support faculty and research, and make an impact within our community. With the help of an MTSU advisory board, several key objectives were identified for 2019, revolving around projects and research, events, and education. Below is a selected list of the projects, research, events, and educational initiatives from 2019 that will help MTSU continue to lead in the area of data science.
- The addition of Dr. Ryan Otter as a Director for the Data Science Institute
- Hack MT and the creation of Data Dives
- Kirk Borne on campus and in Nashville hosted by the Data Science Institute
- Approval of the Bachelor of Science in Data Science - to start in Fall 2020
- Creation of the Data Science Graduate Certificate
- Murfreesboro Predictive Policing Project
- Research and external projects are going strong with over $800,000 in grant funding
The Data Science Institute adds Dr. Ryan Otter as a Director
One of the key additions in Spring 2019, which helped propel the Data Science Institute into several research and external projects, was Dr. Ryan Otter. Dr. Otter is a Professor of Environmental Toxicology in the Department of Biology. He has been involved in several big data initiatives and externally funded research projects, including the most comprehensive toxicology database in the world. Since joining the Data Science Institute, Dr. Otter has been instrumental in bringing in funding for several projects and has a deep passion for helping students become career-ready in data science.
Hack MT starts the year off right with Using Data for Good with Second Harvest
Part of the objective for 2019 was to get individuals involved and interested in data science. To start this off the right way, the Data Science Institute connected with Second Harvest to attend Hack MT in January and analyze their warehouse data, with the goal of making the storage and shipment of food to those in need more efficient. With the help of Frank Elmo, Director of Operations, and David Tinsley, Director of IT, data was aggregated and pitched as an idea at the 36-hour hackathon. A team of 25 students from 5 different universities tackled the data and in the end presented their findings to Second Harvest and MTSU. The result was second place overall and the Hacker’s Choice Award.
Data Dives - Using Data for Good
After the success at Hack MT, the idea for a data hackathon specific for the Data Science Institute was proposed by the MTSU Data Science Institute advisory board by Dr. John Wallin. At one point, he termed it a data dive, and the concept and name were born.
A Data Dive is a data-specific hackathon where one dataset is used by all groups to dive deep into data to answer specific objectives. This could include any application and result in visualization, storytelling, predictive models, or anything else that helps the dataset owners use their data to make better decisions.
The first Data Dive was on March 29th and 30th, using the same Second Harvest data from Hack MT. Over 75 students, 10 faculty, and 5 IT professionals attended the two-day event and analyzed the data to find trends, insights, and predictions for warehouse data at Second Harvest. The findings were presented to the directors at Second Harvest at the end of the second day. The event also resulted in a team that worked on Second Harvest data over the summer to provide more insight.
A second Data Dive was held on September 27th and 28th for Special Kids, Inc. The data that was analyzed was their donor data and teams were to look at how to create new donors, what makes up a sustainable donor, and what is the profile of a potential donor. This data dive was also held over two days with presentations at the end of the second day to the Director of Development for Special Kids, Stephanie Folkmann. This also led to a project after the event where a student and Dr. Apigian worked with Special Kids to create a dashboard that aggregated their activity data and showed donations within different time periods and categories.
The result of these Data Dives is not only analysis for the non-profits that are involved, but it is also an excellent opportunity for any student, faculty member or any external individual to dive into data with little knowledge or expertise and get started on their path to data science.
The next Data Dive will be our first 24-hour event, starting on March 27th. We will also be partnering with VHT to provide expertise, data, and resources for the event. It should be an excellent opportunity for more individuals to get started with data and data science.
Kirk Borne at MTSU and Nashville
One thing is for sure: the data science community is absolutely amazing, and when Dr. Kirk Borne was asked to come to MTSU and Nashville to speak, he was gracious enough to accept. Dr. Kirk Borne is the Principal Data Scientist and an Executive Advisor at global technology and consulting firm Booz Allen Hamilton and is considered one of the most influential data scientists on Twitter (over 250K followers).
Dr. Borne presented along with Dr. Charlie Apigian, Shruti Sharma (Ingram Content Group), and Dr. John Liu (Intelleron) on September 25th at the Nashville Technology Council’s Tech Hill Commons for a fireside chat about AI and implications for Nashville. It was a joint meetup with Data Nerds, Data Science Nashville, and the Greater Nashville Healthcare Analytics and over 120 IT professionals were in attendance. He then presented on Thursday, 9/26 at MTSU to several student groups and presented a talk titled “You and the Environment ... and Data Literacy”. Finally, Dr. Borne opened the Data Dive on September 27th with an inspirational talk about the importance of playing around with data.
The three days also included several meetings with individuals from Nashville, administration from MTSU, students, and faculty. What was intriguing about the three days of events was that Dr. Borne was instrumental in creating the first data science program in the country at George Mason back in 2007, and we were in the final stages of approval for the MTSU data science program. It was a great way to put the final touches on the new program with a true leader in the field, but more importantly a wonderful and kind individual. Thank you again Dr. Borne - you were an inspiration to individuals at MTSU and the Nashville community.
Education Initiatives at MTSU - a Bachelor’s Degree and Graduate Certificate
Back in August of 2018, the idea of a bachelor’s degree in data science was developed by an interdisciplinary group at MTSU made up of Math, Computer Science, Economics and Finance, and Information Systems & Analytics. In 2019, this idea became a reality, and the program will start in Fall 2020. The curriculum was devised with the help of several industry experts and includes aspects of statistics, programming, and business, along with opportunities for students to dive into other industries, since data science needs to be infused into every discipline and industry.
Bachelor of Science in Data Science
Getting a new program approved at a public university is an arduous process with many steps, including approvals at the university (April 2019), from an external review (July 2019), from the MTSU Board of Trustees (October 2019), and finally from the Tennessee Higher Education Commission (November 2019). This is usually a 24-month process, but with help from all parties involved, this proposal was submitted (January 2019) and approved within eleven months. A big thank you to everyone involved. Stay tuned for more information, as the official announcement will be in February 2020 and we will have students starting in Fall 2020.
Data Science Graduate Certificate
A second initiative at MTSU was to help individuals that are already educated but want to upskill for analytics and data science. Therefore, a team of faculty and a team from video production started working on a graduate certificate that will be different than anything that MTSU has offered in the past.
This includes four 7-week courses that can be completed in two semesters. Each course will include highly produced online content that covers business and data scenarios, the foundation of statistics, and how to program in Python. Students will learn online and then be required to attend a data dive event at the end of the 7 weeks to work through a data problem. (Yes, you heard that right! The idea of a data dive is now going to be included in the curriculum.) The four courses are titled Data Understanding, Data Exploration, Predictive Modeling, and Model Optimization. We believe that this will be a unique opportunity for individuals who either currently work in the space and need the necessary skills to stay up to date or who want to pivot into a career in data science. Are you interested? Stay tuned, more to come.
These two programs are the starting point for data science programs at MTSU. For 2020, expect a new data science track within the Computational Science PhD program and a new minor in data science; a Master’s Degree will also be proposed. Also, with the help of the Data Science Institute, data science courses are being developed at Franklin High School and Central Magnet High School. This gives MTSU programs at every level, including high school, and gives a high school student a path from K-12 to PhD.
Murfreesboro Predictive Policing Project
After several conversations with the Murfreesboro Police Department, a need was identified: could burglaries be predicted from existing data at the precinct? With the help of the department's data analysts, a team consisting of one student, one faculty member from Information Systems & Analytics, and the Data Science Institute team aggregated the data they viewed as relevant. Through this process, 3 years of activity data were collected, along with demographics for each sub zone within the city limits of Murfreesboro. It was then determined that more data was needed for a reliable model and that the interval for prediction would be one week. The question therefore became: can last week’s call activity predict this week’s burglaries within distinct sub zones?
Call activity includes any call that comes into the police department and the result of that call. For example, if someone calls up and reports trespassing, it is entered as trespassing, and then after addressing the issue, it is entered into the system with its results or disposition, such as warning or arrest. This also includes activity such as burglar alarms, intoxication, disturbance, etc. Therefore, the target variable was this week’s burglaries within a sub zone as a 0 or 1, with 1 being one or more burglaries within that sub zone. The features for the model included:
- Call activity and dispositions from last week per sub zone
- Call activity and dispositions from last week around each sub zone
- Burglaries from last week per sub zone
- Burglaries from last week around each sub zone
NOTE: We excluded demographic data to avoid any profiling issues for the police department. All predictions were based on activity only.
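The feature list above can be illustrated with a small sketch. This is not the project's actual code: the record layout, the sub zone names, the `NEIGHBORS` adjacency map, and the function names are all hypothetical, and burglaries are assumed here to appear in the same call records as other activity. It only shows how last week's activity in and around a sub zone could be turned into features, with a 0/1 target for this week's burglaries.

```python
from collections import Counter

# Hypothetical call records: (week, sub_zone, call_type, disposition).
CALLS = [
    (1, "A", "trespassing", "warning"),
    (1, "A", "burglar_alarm", "no_action"),
    (1, "B", "burglary", "report"),
    (1, "B", "disturbance", "arrest"),
    (2, "A", "burglary", "report"),
    (2, "C", "intoxication", "arrest"),
]

# Hypothetical adjacency map: the sub zones surrounding each sub zone.
NEIGHBORS = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}


def weekly_counts(calls):
    """Count identical (week, sub_zone, call_type, disposition) records."""
    return Counter(calls)


def build_row(counts, week, zone, neighbors):
    """Return (features, target) for one sub zone and one week.

    Features use only the previous week's activity, in the sub zone
    itself ("own") and in its neighbors ("around"). The target is 1 if
    the sub zone had one or more burglaries this week, otherwise 0.
    """
    features = Counter()
    for (w, z, call_type, disposition), n in counts.items():
        if w != week - 1:
            continue  # only last week's activity feeds the features
        if z == zone:
            prefix = "own"
        elif z in neighbors.get(zone, []):
            prefix = "around"
        else:
            continue  # activity outside the neighborhood is ignored
        if call_type == "burglary":
            features[f"{prefix}_burglaries"] += n
        else:
            features[f"{prefix}_{call_type}_{disposition}"] += n

    burglaries_now = sum(
        n for (w, z, call_type, _), n in counts.items()
        if w == week and z == zone and call_type == "burglary"
    )
    target = 1 if burglaries_now >= 1 else 0
    return dict(features), target


counts = weekly_counts(CALLS)
features_a, target_a = build_row(counts, week=2, zone="A", neighbors=NEIGHBORS)
features_c, target_c = build_row(counts, week=2, zone="C", neighbors=NEIGHBORS)
```

Rows built this way, one per sub zone per week, could then be passed to any binary classifier. No demographic fields appear in the records, in keeping with the note above.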
Throughout the year the model was refined, and with ten years of data it was 82.3% accurate in predicting burglaries within sub zones. The results were then implemented in a map feature for the police to view. This project was presented at the Nashville Analytics Summit on September 10th and at the Decision Science Institute Conference in New Orleans, LA on November 24th. The presentation from the Analytics Summit, “Using Data for Good: A Predictive Policing Model” (Charles Apigian, Chris Germiller, NAS2019), is available on YouTube.
Other Projects in 2019 within the Data Science Institute
There were several other projects that were conducted with industry partners. The first project, in 2018, was with Hytch Rewards LLC. We cannot thank them enough for allowing us to get started in the Data Science Institute and for being patient with us as we learned how to develop teams for analysis. From September 2018 to April 2019, we were contracted with Hytch Rewards LLC to dive into their activity data and look for trends. This resulted in three reports to Hytch that offered insight into their data that had not been discovered before. If you want to make a difference through carpooling by reducing congestion on our roads and being a good steward within the community, please check out Hytch Rewards and start carpooling today. https://hytch.me/
The Data Science Institute also worked on projects within MTSU that included predictive modeling, natural language processing, and visualizations. It also partnered with several companies to figure out ways to help create a more data-literate workforce.
Research projects at MTSU - over $800,000 in funding
An institute at a university would not be complete without a strong research stream, both internal and external. The premise of research within the Data Science Institute is to support data driven research with faculty at MTSU. To facilitate this, Dr. Ryan Otter, who was brought in as a Co-Director, was instrumental in bringing in projects related to the environment and sustainability.
Awarded a USDA grant for “Advanced In-Stream Water Quality Monitoring of the Red River Watershed” at $298,475. This grant is in cooperation with Frank Bailey in Biology at MTSU and the Data Science Institute. The Data Science Institute will create the data infrastructure for this project (Awarded August 2019).
Awarded a USGS grant for “Remedy and Restoration Effectiveness at Great Lakes Areas of Concern” at $98,460 per year for 5 years. This will look at data that has been collected within different Great Lakes and run models to determine if specific areas or regions are at risk. The Data Science Institute will create the data infrastructure and modeling for this project. Expected total funding of up to $500,000 over 5 years (Awarded July 2019).
What is in store for 2020?
Expect 2020 to be more of the same, with a focus on career readiness for our students and the community. With new programs, new events, and new projects, the Data Science Institute will be very busy making sure that it does its part to get kids ready for a world where the use of data is not just preferred, but required. Through programs that allow students to not only learn, but also do, the Data Science Institute and programs at MTSU are ready to create the future career-ready data workforce.
- Data Science Institute 2019 - A Year in Review and a Look Forward to 2020 Jan 6, 2020