This is a guest post by Emmanuel Sibanda, a Full Stack Engineer with expertise in React/NextJS, Django, and Flask, who has been using liblab for one of his hobby projects.
Boxing data is very hard to come by; there is no single source of truth. One could argue that BoxRec is the 'single source of truth'. However, on BoxRec you will only find stats on a boxer's record and a breakdown of the fights they have had. If you want more nuanced data to better understand each boxer, you would need to go to CompuBox for punch stats recorded per fight. This doesn't cover all fights, as CompuBox presumably only covers fights that are high profile enough for it to show up and manually record the number and type of punches thrown.
Some time back I built a project that automates retrieving data from BoxRec and enriches it with data from CompuBox. With this combination of data, I can analyze:
- a boxer's record (e.g. what is the calibre of the opponents they have faced, based on their opposition's track record)
- a boxer's defense (e.g. how many punches do their opponents attempt to throw at them in each recorded fight and, on average, how many of these punches actually land). In theory, I could break this down further into how well the boxer defends against jabs and power shots
- a boxer's accuracy using similar logic to above
- how age has affected both a boxer's accuracy and defense based on the above two points
- a comparison of whether being more defensive or accurate has a correlation to winning a fight (e.g. when a fight goes the full length, do judges often have a bias towards accuracy, aggression or defense)
These are all useful questions. Whether you want to leverage machine learning to predict the outcome of a fight, build a realistic boxing game, or do something else entirely, answering them could help you construct additional parameters to use in your prediction model.
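As a rough illustration of the accuracy and defense measures listed above, both can be expressed as simple ratios over per-fight punch counts. The `FightPunchStats` fields and the numbers below are made up for this sketch; they are not the actual schema or data from the project.

```python
from dataclasses import dataclass


@dataclass
class FightPunchStats:
    """Punch counts for one boxer in one recorded fight (hypothetical schema)."""
    thrown: int            # punches the boxer attempted
    landed: int            # punches the boxer landed
    opponent_thrown: int   # punches the opponent attempted
    opponent_landed: int   # punches the opponent landed


def accuracy(fights):
    """Share of the boxer's attempted punches that landed, across all fights."""
    thrown = sum(f.thrown for f in fights)
    landed = sum(f.landed for f in fights)
    return landed / thrown if thrown else 0.0


def defense(fights):
    """Share of the opponents' attempted punches that missed, across all fights."""
    thrown = sum(f.opponent_thrown for f in fights)
    landed = sum(f.opponent_landed for f in fights)
    return (1 - landed / thrown) if thrown else 0.0


# Made-up numbers, for illustration only
fights = [
    FightPunchStats(thrown=400, landed=120, opponent_thrown=500, opponent_landed=100),
    FightPunchStats(thrown=600, landed=180, opponent_thrown=300, opponent_landed=60),
]
print(f"accuracy: {accuracy(fights):.2f}")
print(f"defense: {defense(fights):.2f}")
```

Pooling the counts across fights weights long fights more heavily than short ones; averaging per-fight ratios instead would be an equally reasonable choice.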
Task: Create an easily accessible API to access the data I have collected
Caveat: I retrieved this data around November 2019, and a lot has happened since then. I intend to fetch new data on the 19th of November 2023.
When I initially built this project, it was a simple frontend that let people predict the outcome of a boxing match using a machine learning model I had built on this data. I got quite a few emails from people asking me how I got the data to build the model.
To make this data easily accessible, I developed a FastAPI app with an exposed endpoint for data retrieval. The implementation adheres to OpenAPI standards. I integrated Swagger UI so the endpoint can be tried directly from the API documentation. You send the name of a boxer and receive stats pertaining to their record.
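Calling an endpoint like this from Python, without any SDK, might look roughly like the sketch below. The host, the `/boxer` path, and the `name` query parameter are assumptions made for illustration; the real API may use different names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical values: the real host, path, and parameter name may differ.
BASE_URL = "https://example.com"
ENDPOINT = "/boxer"


def build_boxer_url(name, base_url=BASE_URL):
    """Build the request URL for one boxer, URL-encoding the name."""
    return f"{base_url}{ENDPOINT}?{urlencode({'name': name})}"


def fetch_boxer_stats(name, base_url=BASE_URL):
    """Send the GET request and decode the JSON body."""
    with urlopen(build_boxer_url(name, base_url), timeout=10) as response:
        return json.load(response)


# Only the URL is built here, so the sketch runs without a network call.
print(build_boxer_url("Muhammad Ali"))
```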
Creating an SDK to enable seamless integration using liblab
I intend to keep iteratively adding more data and ensuring it is up to date. To make the data easier to access, I decided to create a Software Development Kit. In simple terms, think of this as a wrapper around the API that comes with pre-defined methods you can use, reducing how much code you would need to write to interact with the API.
While creating these SDKs, I came across liblab, an SDK-as-a-service platform that enables you to instantly generate SDKs in multiple languages. The documentation was very detailed and easy to understand. The process of creating the SDK was even simpler. I especially like that when I ran the command to build my SDKs, I got warnings with links to OpenAPI documentation. These helped me make sure my API correctly conformed to OpenAPI standards, since an API that doesn't conform could result in a subpar SDK.
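To make the wrapper idea concrete, here is a small hand-written sketch of what an SDK-style client could look like. It is not the code liblab generates; the `BoxingDataClient` class, the `get_boxer` method, and the `/boxer` path are all hypothetical names chosen for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get_json(url):
    """Default transport: perform a GET request and decode the JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


class BoxingDataClient:
    """Hand-written illustration of an SDK-style wrapper (not liblab output)."""

    def __init__(self, base_url, transport=http_get_json):
        self.base_url = base_url.rstrip("/")
        self.transport = transport  # swappable, e.g. for tests

    def get_boxer(self, name):
        """Pre-defined method: pass a boxer's name, get parsed data back."""
        query = urlencode({"name": name})
        return self.transport(f"{self.base_url}/boxer?{query}")


# A fake transport echoes the URL, so the sketch runs without a network call.
def fake_transport(url):
    return {"requested_url": url}


client = BoxingDataClient("https://example.com/", transport=fake_transport)
result = client.get_boxer("Muhammad Ali")
print(result["requested_url"])
```

The caller never builds URLs or parses JSON; that detail lives behind `get_boxer`, which is the kind of boilerplate an SDK is meant to absorb.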
Here's a link to version 1 of the BoxingData API.