Spring 2024 EL/R Project Offerings
New Projects
AI Energy Optimization - Asif Imran
Openings (4)
Description
We aim to design and develop a tool that will analyze popular ML frameworks like PyTorch, TensorFlow, Dask, pandas, etc. and gather data at the method level to estimate energy consumption and carbon emissions for big data workloads that leverage those frameworks. Next, we would like to use ML itself in the tool to suggest how changes to the structure of the code can improve energy usage for the same workload. Overall, we are planning to design an automated refactoring tool that refactors from the perspective of energy optimization and not only code quality. The team needs to have knowledge of software engineering processes like requirements engineering, software quality assurance and testing, the Python language, abstract syntax trees (AST), and software documentation. We can divide the project into the following phases:
- Phase 1: Identify the attributes in ML frameworks that contribute to energy consumption. These can be specific methods, classes, etc. that consume significant energy during execution.
- Phase 2: Build software that will automatically execute workloads on the ML frameworks and collect energy data at the GPU level. Gather this data at the method level.
- Phase 3: Use ML to train the software to suggest refactorings of ML framework code to optimize energy usage.
- Phase 4: Execute automatic refactoring of the code based on the feedback from Phase 3.
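As a rough illustration of the AST side of Phase 1, the sketch below uses Python's standard `ast` module to list which framework functions a workload script calls, which is one possible starting point for attaching energy measurements at the method level. The watch list `TRACKED_MODULES`, the function name `framework_calls`, and the sample script are assumptions made for this example, not part of the project.

```python
import ast
from collections import Counter

# Hypothetical watch list of top-level modules; the project may track others.
TRACKED_MODULES = {"torch", "tensorflow", "dask", "pandas"}


def framework_calls(source: str) -> Counter:
    """Count calls reached directly through a tracked module, by dotted name."""
    tree = ast.parse(source)

    # Map each local name bound by a plain `import` to the module it refers to.
    # (`from x import y` is deliberately not handled in this sketch.)
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for item in node.names:
                root = item.name.split(".")[0]
                if root not in TRACKED_MODULES:
                    continue
                if item.asname:
                    aliases[item.asname] = item.name
                else:
                    aliases[root] = root

    counts = Counter()
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        # Unwind attribute access such as pd.read_csv into its parts.
        parts = []
        func = node.func
        while isinstance(func, ast.Attribute):
            parts.append(func.attr)
            func = func.value
        if isinstance(func, ast.Name) and func.id in aliases:
            parts.append(aliases[func.id])
            counts[".".join(reversed(parts))] += 1
    return counts


SAMPLE = """
import pandas as pd
import torch

df = pd.read_csv("data.csv")
x = torch.tensor([1.0, 2.0])
y = torch.tensor([3.0, 4.0])
print(len(df))
"""

calls = framework_calls(SAMPLE)
```

This only sees calls made directly on an imported module name; method calls on returned objects (for example `df.groupby(...)`) would need type information or runtime tracing, which is outside this sketch.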
Adding FM-Index to the Coriolis Metagenomic Classifier - Jaric Zola
Openings (4)
Description
Metagenomic classification is a process in which we try to identify DNA. Consider a biological sample. Perhaps it has been taken from a patient with some unknown disease, or maybe it comes from a lake whose ecosystem is collapsing for some reason. We can extract all DNA from such a sample, and then sequence it (DNA sequencing converts every biological DNA molecule into a corresponding string of ACGTs that we can store in a text file). Now the question becomes: given a set of DNA strings that we got from the sequencer, what are these DNA strings? From what organisms do they come? If we can answer that, we may be able to identify a culprit that makes our patient sick, or causes the collapse of our lake. This is the job of metagenomic classifiers, software tools that do metagenomic classification.
The SCoRe Group has developed and maintains Coriolis, a metagenomic classifier written in modern C++ and tailored for mobile devices. The key element of the classifier is a clever way to maintain the so-called reference database. The reference database is a large database of known DNA sequences (for example, we know the ACGT sequences that make up the DNA of humans, E. coli, and thousands of other organisms). By comparing unknown DNA to the sequences in the reference database, we can identify the organism from which the unknown DNA most likely comes.
In this project, we want to extend Coriolis so that it can use an FM-index. The FM-index is a clever data structure designed for indexing large texts. To realize this project, your team will first learn basic string processing algorithms. Next, you will implement a simple program using libsdsl (https://github.com/simongog/sdsl-lite) so that you can confidently implement FM-index creation and querying. Finally, you will take this experience to the next level and implement FM-index support directly in Coriolis.
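To give a feel for what an FM-index does, here is a minimal Python sketch of index construction and backward search over a short DNA string. It is a teaching toy under stated assumptions: the function names `build_fm_index` and `locate` are made up for this example, the suffix array is built naively (far too slow for real genomes), and none of this reflects the libsdsl API or the Coriolis code base.

```python
def build_fm_index(text: str):
    """Build a toy FM-index: suffix array, C table, and occurrence table."""
    text += "$"  # sentinel; assumed absent from text and smaller than A, C, G, T
    # Naive suffix array construction, fine for short examples only.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))

    # c_table[c] = number of characters in text strictly smaller than c.
    c_table, total = {}, 0
    for c in alphabet:
        c_table[c] = total
        total += text.count(c)

    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, c_table, occ


def locate(pattern: str, index) -> list:
    """Return sorted start positions of pattern, using backward search."""
    sa, c_table, occ = index
    lo, hi = 0, len(sa)  # half-open range of matching suffix array rows
    for c in reversed(pattern):
        if c not in c_table:
            return []
        lo = c_table[c] + occ[c][lo]
        hi = c_table[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


index = build_fm_index("ACGTACGT")
hits = locate("ACG", index)
```

The search processes the pattern from its last character to its first, narrowing a range of suffix array rows at each step, so the query cost depends on the pattern length rather than on the size of the indexed text.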
LED Video Wall - Nicholas Myers
Openings (6)
Description
The CSE department is looking for an innovative digital signage solution to impress potential new students and modernize our facilities. This new project will allow the creation of affordable LED Video Walls that can be custom-built to perfectly fit any location.
Our goals for the first semester are to:
- Develop an efficient protocol on top of TCP for updating thousands of LEDs at high frame rates
- Create an Arduino (C++) client for microcontrollers that will connect to a server and control multiple addressable (WS2812B-based) LED matrices according to our protocol
- Create a Python server application that uses the protocol to send LED pixel updates to multiple microcontrollers according to a configuration that defines how the video wall is physically arranged
- Physically build and wire a roughly 3x3 foot video wall for demonstration purposes
- Display static image files on the video wall
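Because TCP delivers a byte stream rather than discrete messages, a protocol like the one described needs some framing so the receiver can tell where one pixel update ends and the next begins. The sketch below shows one possible length-prefixed layout using Python's `struct` module. The header fields, the `LW` magic bytes, and the function names are all hypothetical choices for illustration; the actual protocol is for the team to design.

```python
import struct

# Hypothetical frame layout (an assumption, not the project's protocol):
#   2-byte magic "LW" | 1-byte panel id | 2-byte pixel count | RGB triples
# ">" selects big-endian byte order with no padding, so the header is 5 bytes.
HEADER = struct.Struct(">2sBH")
MAGIC = b"LW"


def encode_frame(panel_id: int, pixels) -> bytes:
    """Pack a list of (r, g, b) tuples, each channel 0..255, into one frame."""
    body = bytes(channel for pixel in pixels for channel in pixel)
    return HEADER.pack(MAGIC, panel_id, len(pixels)) + body


def decode_frames(buffer: bytes):
    """Split received bytes into complete frames.

    Returns (frames, leftover), where leftover holds any trailing partial
    frame that must be kept until more bytes arrive on the socket.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        magic, panel_id, count = HEADER.unpack_from(buffer, offset)
        if magic != MAGIC:
            raise ValueError("stream out of sync")
        end = offset + HEADER.size + 3 * count
        if end > len(buffer):
            break  # frame body not fully received yet
        body = buffer[offset + HEADER.size:end]
        pixels = [tuple(body[i:i + 3]) for i in range(0, len(body), 3)]
        frames.append((panel_id, pixels))
        offset = end
    return frames, buffer[offset:]


frame = encode_frame(3, [(255, 0, 0), (0, 255, 0)])
# Simulate a read that returns one full frame plus the start of the next.
frames, leftover = decode_frames(frame + frame[:4])
```

A fixed-size binary header keeps per-frame overhead small, which matters when thousands of pixels are refreshed many times per second; the same layout is straightforward to parse in Arduino C++ on the client side.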
Skills
You don’t need to know everything coming into the project, but you should be excited to learn!
- Python
- Arduino C++
- Minor electrical knowledge (simple circuits)
Encrypted Storage Resistant to Compromise - Marina Blanton
Openings (3)
Description
The core of the project will be developing a system for encrypted storage where the decryption key is split among multiple devices and compromise of a device with a partial key does not lead to disclosure of the encrypted content. The cryptographic component will be written in Rust and thus the project involves learning Rust programming.
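One simple way to see how a key can be split so that a single compromised device reveals nothing is XOR-based secret splitting, sketched below in Python for brevity (the project itself will use Rust). This is an assumption chosen for illustration: the description does not say which scheme the project uses, and the names `split_key` and `combine_shares` are hypothetical. Note that this variant needs every share to recover the key; a threshold scheme such as Shamir's secret sharing would be needed to tolerate a lost device.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, n: int) -> list:
    """Split key into n shares; all n are required to reconstruct it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are uniformly random; the last one is chosen so that
    # XOR-ing all shares together yields the original key.
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    last = key
    for share in shares:
        last = xor_bytes(last, share)
    return shares + [last]


def combine_shares(shares) -> bytes:
    """Recover the key by XOR-ing every share together."""
    key = bytes(len(shares[0]))  # all-zero starting value
    for share in shares:
        key = xor_bytes(key, share)
    return key


key = secrets.token_bytes(32)
shares = split_key(key, 3)
recovered = combine_shares(shares)
```

Any subset of fewer than all shares is statistically independent of the key, which is the property that makes a single-device compromise harmless in this toy setting.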
The long-term goal includes developing user interfaces for major operating systems and major types of mobile devices to permit a user to store partial keys on different devices. Additional features will include provisions for re-keying in the event of a device compromise as well as key backup and recovery.
As this involves development of security software, the project will follow best security development practices with code review and security-related testing.
Skills:
Exposure to security concepts and/or cryptographic libraries is a plus.
Spreadsheet Evolved - Oliver Kennedy
Openings (4)
Description
Spreadsheets are really powerful programming tools. There is a lot you can do with them. But... they're also kinda stale. It's hard to make a spreadsheet "look good", and there's a lot of ways in which an infinite grid of cells is just painful to work with. There is a better way! Apple's Numbers takes the basic spreadsheet formula and revisits a lot of assumptions: Instead of a grid of cells, you get tables, you get UI widgets... you have an entire blank canvas on which to create actual interfaces. You get a formula entry tool that gives you human-readable formulas. Unfortunately, there's only one place you can get Numbers, and it's getting harder and harder each year to be a Mac user.
The goal of this project is to design and build an open-source adaptation of Apple Numbers, and eventually to extend it into something even more powerful and scalable.
Semester 1 goals:
- Select a programming platform (Rust, Scala, or OCaml preferred; something that allows common front/back-end code is ideal)
- Design and implement a data model for spreadsheet 'tables' and spreadsheets (e.g., a CRDT or log-based view)
- Implement a front-end canvas, data table widgets, and tabs in a reactive web programming environment
- Time permitting: Implement a websocket server to replicate the frontend data model to/from the server.
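To make the "log-based view" in the data model goal concrete, here is a small sketch in which a spreadsheet table is represented as a set of cell-write operations, and the visible state is obtained by replaying them in a deterministic order. It is written in Python only for brevity, and the `CellWrite` fields, the logical clock, and the last-writer-wins rule are assumptions for this example rather than a design the project has chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class CellWrite:
    """One logged edit. Field order matters: writes sort by clock, then replica."""
    clock: int      # hypothetical logical timestamp
    replica: str    # tie-breaker when two replicas share a clock value
    table: str
    row: int
    column: str
    formula: str


def replay(log) -> dict:
    """Rebuild table state from a log; the latest write to a cell wins."""
    state = {}
    for op in sorted(log):
        state[(op.table, op.row, op.column)] = op.formula
    return state


def merge(log_a, log_b) -> set:
    """Merge two replicas' logs. Set union is commutative and idempotent."""
    return set(log_a) | set(log_b)


server_log = {CellWrite(1, "server", "Budget", 0, "Total", "=SUM(Cost)")}
client_log = {CellWrite(2, "client", "Budget", 0, "Total", "=SUM(Cost) * 1.08")}

state_one = replay(merge(server_log, client_log))
state_two = replay(merge(client_log, server_log))
```

Because merging is a set union and replay uses a total order, both sides converge on the same state regardless of the order in which they exchange logs, which is the property a websocket replication layer would rely on.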
Unfold Studio - Christopher Proctor
Openings (4)
Description
Unfold Studio (https://unfold.studio) is a web application for reading and writing interactive stories, used in middle- and high-school CS and English/Language Arts classes in several US states and overseas. Unfold Studio allows users to write interactive stories in a programming language called Ink, and to interact with one another via common social media affordances.
Unfold Studio is set to be used for a national AI writing competition, and the team will focus on ensuring the infrastructure is ready for it. This includes performance and scalability testing, deployment, and content moderation. The project will also involve building relevant features, on both the front and back end, studying user interactions, and implementing analytics. Other tasks include bug fixing, code refactoring, maintaining compliance with child privacy laws, and technical documentation.
Skills:
Experience in Python. Knowledge of Django and Postgres is a bonus. Any experience with performance profiling, scaling, and DevOps is also appreciated.
Existing/Continuing Projects
Find a Mechanic - Josh Khreis
Openings (4)
Description
Find a Mechanic, or FAM for short, is a service-based web page and app with the goal of becoming the DoorDash and/or Instacart of trades and services. Right now we’re focusing on the mechanics trade because it is the largest and has year-round work.
This software has three core portions. The first is a marketplace web page where a shop can administer and delegate time, work, and resources for customer issues, and carry out the entire workflow and communication with customers. The other two portions are the two uses of the app. The first use is for a shop worker, who has work delegated to them from the shop web page and can post updates on the progress of a vehicle. The second is a user-facing app that lets customers find a shop in their desired niche by reviews and interact with the shop from start to finish.
Right now FAM is nearly finished at its most basic level. It still needs more features and some additions, listed below.
- Changes/builds depending on customer feedback
- UI improvements (consider using a UI library)
- Build/expand database (better reporting and analytics)
- Add to the calendar
- Bug/glitch fixing
- Securing and consolidating the code
- Potentially adding a payment system
This is what we’re planning for the foreseeable future as we start demoing with various shops and with students who test the software. We believe the issues that trickle in will help build a better process.
Our tech stack right now is AWS, React Native, PHP, MySQL, CSS3, and Expo.
We have a small team willing to help and guide in all aspects, including making sure our team is taken care of. We are also willing to consider all ideas that could fix or improve the software.
Material Microstructure - Olga Wodo
Openings (6)
Description
Over the ages, new materials have driven innovation and shaped our civilization. The names of the main prehistoric phases of human history (e.g., the Stone Age, Bronze Age, and Iron Age) are testimony to this statement. This progress has been paralleled by a better understanding of the relationship between (micro)structure and material properties, which has led to a plethora of materials (e.g., reinforced concrete, LEDs, graphene, organic semiconductors). In recent years, progress in materials research has been fueled by machine learning (ML) and artificial intelligence (AI). To streamline the transition, information about a material's structure needs to be converted into an ML-readable format. GraSPI is software developed at UB that featurizes micrographs into an array of physically meaningful descriptors that can be used directly in an ML pipeline.
In this project, students will enhance the existing GraSPI project. GraSPI uses a graph/network as a data structure to efficiently calculate descriptors from materials micrographs. The project's current version is coded in C/C++ using the Boost library. This project aims to translate the current implementation to be Python-native (e.g., using NetworkX) or to use the Boost.Python library. Part of the project will be to:
- Identify the best strategy for the needed translation (research solutions, plan and execute basic tests with classic graph-based algorithms, and make a recommendation)
- Plan the translation between the two packages
- Implement the Python packages (core functionality, documentation, example notebooks)
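As an example of the kind of classic graph-based algorithm that could serve as a basic test during the translation, the sketch below treats a tiny two-phase micrograph as a grid graph and counts the connected components of one phase with breadth-first search. The function name, the 4-neighbor connectivity, and the sample image are assumptions for illustration; this is not GraSPI's API, and it uses only the standard library rather than NetworkX so that it stays self-contained.

```python
from collections import deque


def count_components(image, phase) -> int:
    """Count 4-connected regions of pixels whose value equals phase."""
    rows, cols = len(image), len(image[0])
    seen = set()
    components = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != phase or (r, c) in seen:
                continue
            # Found an unvisited pixel of this phase: flood its whole region.
            components += 1
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and image[ny][nx] == phase
                        and (ny, nx) not in seen
                    ):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
    return components


# Toy micrograph: 0 and 1 stand for the two material phases.
MICROGRAPH = [
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]

phase_one_regions = count_components(MICROGRAPH, 1)
phase_zero_regions = count_components(MICROGRAPH, 0)
```

Running the same small inputs through both the C/C++ version and a Python candidate is one way to compare correctness and speed before committing to a translation strategy.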
Skills
Python programming experience and familiarity with modern development tools (such as GitHub and Linux).
Fresumes AI Mass Messaging - Matt Morgan
Openings (6)
Description
AI mass messaging/robo recruiter.
We have employers paying monthly subscriptions to send automated messages to job seekers. We want AI to improve this feature by:
- Identifying talent to contact
- Writing customized messages
- Interviewing talent (text message interviews to start, and video interviews with an AI avatar if we have enough time to build them)
pgAQP Plugin - Zhuoyue Zhao
Openings (6)
Description
The overall goal of the project is to create an open-source experimental PostgreSQL plugin called pgAQP for approximate query processing in hybrid transactional/analytical processing systems. Approximate query processing is a new database query processing technology that allows one to query a large database with significantly lower latency with a small trade-off in query answer accuracy. The plugin will be used for research, evaluation and education purposes, and will enable any existing PostgreSQL database to add approximate query processing support.
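The latency-for-accuracy trade-off behind approximate query processing can be illustrated with uniform random sampling, arguably the simplest technique in this family: answer an aggregate from a small sample and scale the result up. The sketch below is an assumption-laden illustration in Python; `approximate_sum` is a made-up name, and the actual pgAQP design (which is built around a special database index) is not described here and may work quite differently.

```python
import random


def approximate_sum(values, sample_fraction: float, seed: int = 0) -> float:
    """Estimate sum(values) from a uniform random sample without replacement."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    n = len(values)
    k = min(n, max(1, round(n * sample_fraction)))
    sample = rng.sample(values, k)
    # Scale the sample total by the inverse of the sampling rate.
    return sum(sample) * n / k


values = list(range(1000))
exact = sum(values)
estimate = approximate_sum(values, 0.1)
relative_error = abs(estimate - exact) / exact
```

Reading 10% of the rows does roughly 10% of the work, at the cost of an answer that is only close to the exact one; setting `sample_fraction` to 1.0 recovers the exact sum.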
The immediate goal of Spring 2023 is to migrate a special database index module, which is currently a research prototype built inside PostgreSQL 13.1, into a standalone library, so that this index can be used in newer versions of PostgreSQL system in the future. The code migration effort has been partially completed by a departing student in our lab. The EL/R students will learn the necessary background knowledge, read the code base, write API wrappers, and write test cases.
Skills
SQL, GoogleTest framework
Autolab - Jesse Hartloff
Openings (10)
Description
Students can join the ongoing DevU project, which has the goal of replacing AutoLab. DevU will be built from the ground up here at UB with extensibility as a core design philosophy. The app will also have a full-featured API allowing instructors and students to write their own software that interacts with DevU whenever they want a feature that has not been implemented in the project itself.
Short-term (1-semester) goals include building the autograding pipeline, maintaining the gradebook for each course, and building out the front end.
Indoor Wireless Localization - Roshan Ayyalasomayajula
Openings (6)
Description
Indoor wireless localization (specifically over Wi-Fi) has become quite widespread. There is an active task group, 802.11bf, working to bring it into the standards, and there are many learning-based solutions that bring localization accuracy to the sub-meter level. While this is great news for indoor automation, there are deployability concerns in terms of generalizability, as well as privacy concerns. The most popular approach is to use data-driven ML models to provide sub-meter-accurate localization, but these models still need to generalize to different environments and achieve better accuracy.
During this semester, the students will use the web app to collect Wi-Fi router data from a couple of smartphones. The data is then processed online and passed through an ML model. While there is a baseline ML model, the students should improve the generalizability and accuracy of the localization algorithm (a problem that closely resembles object detection). The next stage of the project would be to develop quantifiable differential privacy for these learning models by adding quantifiable noise to the input images.
Skills:
Online Data streaming, Cloud Data Storage, Online Processing, Machine Learning, Signal Processing
Choreographic Programming - Luke Ziarek and Andrew Hirsch
Openings (4)
Description
Choreographic programming aims to simplify distributed and concurrent systems programming by allowing a programmer to specify the system in its entirety: participants, their computations, and their communication patterns. From this high-level choreographic program, we can synthesize concrete implementations for each node in the distributed system or each thread in the concurrent system.
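The step from one global program to per-participant implementations is often called endpoint projection. The toy sketch below shows the idea in Python for a choreography that consists only of message exchanges. The tuple representation, the `project` function, and the buyer/seller example are assumptions for illustration; the team's choreographic language and compiler are not described here and also have to handle local computation and branching.

```python
# A toy choreography: each global step is (sender, receiver, message label).
CHOREOGRAPHY = [
    ("buyer", "seller", "title"),
    ("seller", "buyer", "price"),
    ("buyer", "seller", "decision"),
]


def project(choreography, participant):
    """Derive one participant's local program from the global choreography."""
    local = []
    for sender, receiver, label in choreography:
        if sender == participant:
            local.append(("send", receiver, label))
        elif receiver == participant:
            local.append(("recv", sender, label))
        # Steps that do not involve this participant are dropped.
    return local


buyer_program = project(CHOREOGRAPHY, "buyer")
seller_program = project(CHOREOGRAPHY, "seller")
```

Because both local programs come from the same global description, every send in one is matched by a receive in the other in the same order, which is why choreographies rule out mismatched communication by construction.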
The team will be responsible for expanding the capabilities of our compiler for a choreographic language. Students will start by learning choreographic programming and writing test cases and sample programs in the language. Students will then build syntax highlighting for both Emacs and Visual Studio. Students will also have the opportunity to work on the back end, creating compilation support targeting LLVM, C, and/or native code.
Teamwork Tool Development - Matthew Hertz
Openings (4)
Description
Enhance the current UI/UX of the teamwork tool (https://cse.buffalo.edu/teamwork), which allows instructors to create and administer peer and team feedback for a variety of course needs. This will also include several feature enhancements to allow the tool to be used across more courses.
Skills
Web Development