# Designing an Algorithm (A Step by Step Guide)

Thanks for checking this out! I have held off writing an article on algorithm creation for a few reasons. The main reason is that algorithm design is not something that can be discussed in a few paragraphs or a brief conversation in email or Direct Message. Not to mention, in the normal world, there is usually too much going on with sports and my non-sports life to sit down and get into it. However, being on quarantine there is one thing we all have right now… time! So, there probably will be no better time than the present to put this article together. Because it is a lengthy topic, I will break this article into multiple parts. I don’t know how many parts it will end up being when it is all done. I will simply start discussing and see where we end. I will however keep all parts of the article on this single link & page. It’ll avoid clutter on the website and keep everything in one place. I will also make it a sticky on the Articles page, so it is always right at the top for new visitors to easily locate. Let’s do it!

Part I – Let’s Get Started!

I will move on to Part II of this article on Friday. In the meantime, I will give you some homework to get up to speed on what I will be discussing. If you want, grab Joe Peta’s book “Trading Bases” and read Chapters 1 & 2. When you finish those chapters it will bring you to Chapter 3 titled “Cluster Luck”. Don’t get into Chapter 3 just yet. In Part II (Friday) of my article I want to break down the topics in Chapters 1 & 2 before moving to the complicated topic of quantifying luck. Then I will let you know what chapters to read over the weekend and we will reconvene here next Tuesday for Part III! The parts of the article will continue until we have developed our MLB algorithm. If you have questions along the way, please submit them using the form on the Contact page here at the website. Your questions and feedback are always welcome as they will help me put together content for this article. Happy reading!

Part II – How It Begins

I know some of the mathematics and statistical discussions can be confusing or complex in “Trading Bases” (to avoid typing the title every time, in the future I will just refer to the “book”). It is for this reason that I am only taking a couple chapters at a time for each part of this article. I want to make sure you have plenty of time to read and even re-read the chapters. I want to make sure you grasp the key concepts that Peta covers. At the time of this writing, we are all essentially on quarantine. So, all we have is time right now. Therefore, I will take the time to proceed slowly and be sure you understand each step fully.

Every algorithm begins with a theory. The theory identifies the variable(s) which you feel correlate to an event’s outcome. Your goal is for your calculation to correlate more accurately to an event’s outcome than the sportsbook’s calculation does. If you achieve your goal, your algorithm would then be “+EV” or carry a positive expected value. Peta describes in his book the “Daily puzzle to be solved in the form of the Vegas line”. I love the sentence because it’s how I’ve always looked at the design of an algorithm. I saw it as me cracking the oddsmaker’s code and designing a better code to beat him. I love the competition of it all. Plus, the competition is on my favorite playing field, mathematics!

Since nobody can predict the future, we attempt to attach probabilities to a specific outcome of an event. The tried and true example for any probability discussion is the old fashioned coin toss. Technically, a coin toss is a 50/50 proposition. When you flip a coin there is a 50% chance of heads and a 50% chance of tails. However, that’s in a perfect environment and assuming a perfect coin with random environmental variables. What if the person flipping the coin accidentally dropped it before flipping it? When they dropped it, the edge of the coin got dinged. Will the ding change the way the air hits the coin and in doing so, favor one result over the other? If it does, the coin toss is no longer 50/50. Heads may have a 51.1% chance against tails having a 48.9% chance. It’s not much, but if the sportsbook offered +100 odds on both heads and tails, now you have a +EV bet when you bet heads on this specific coin. In theory there is an algorithm for coin toss assessment. You would have to weigh the coin, measure the coin, look for imperfections, assess who is flipping the coin, how they place the coin on their thumb when they flip it, wind, air pressure and gravity conditions, etc. All these factors can be broken down into numbers. The numbers are then weighted based on how heavily each variable correlates to the final result. A calculation or algorithm is then put together to dictate the probability of a simple coin toss. The goal of an algorithm is to take all the variables an outcome may depend on, find the proper weighting of the variables and use them to attach a probability to an event. I know, it sounds complex… it is! It’s why very few people take the time to try and create an algorithm and why even fewer are successful. However, I put my pants on the same way as you each day. We both breathe the same air and my college degree was not in mathematics or statistics. So, if I can do it, you can too! You just have to want it bad enough like I did!
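To put a number on the dinged coin example, here is a minimal Python sketch of the expected value arithmetic. The function name `expected_value` and its parameters are my own illustration; the only inputs taken from the example are the 51.1% / 48.9% split and the +100 (even money) price.

```python
def expected_value(win_prob: float, profit_per_unit: float, stake: float = 1.0) -> float:
    """Expected profit of one bet: (chance to win x profit) - (chance to lose x stake)."""
    lose_prob = 1.0 - win_prob
    return win_prob * profit_per_unit * stake - lose_prob * stake


# At +100 a winning 1-unit bet returns 1 unit of profit.
EVEN_MONEY_PROFIT = 1.0

fair_ev = expected_value(0.50, EVEN_MONEY_PROFIT)    # perfect coin
heads_ev = expected_value(0.511, EVEN_MONEY_PROFIT)  # dinged coin, betting heads
tails_ev = expected_value(0.489, EVEN_MONEY_PROFIT)  # dinged coin, betting tails

print(f"perfect coin EV per unit: {fair_ev:+.3f}")
print(f"dinged coin, heads:       {heads_ev:+.3f}")
print(f"dinged coin, tails:       {tails_ev:+.3f}")
```

Heads comes out at about +0.022 units per unit staked, which is the "not much, but +EV" edge described above. Tails is negative by the same amount.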

The core of “Trading Bases” is that Peta wanted to figure out an MLB team’s win percentage so he could bet season win totals. As part of this journey, in Chapter 2, Peta tries to figure out why the Tampa Bay Devil Rays were so strong the previous season. In an effort to break down the Rays’ success, Peta brings up the golden rule of baseball. Baseball’s golden rule is that for every 2 hits you can expect 1 run to be scored for the team. The actual number varies by season, but typically will be between 1.95 and 2.05 hits for every run scored (i.e. 2 hits per run). While the golden rule is simply 2 hits equals 1 run in baseball, Peta correctly surmises there are also other factors which dictate a team’s success in scoring runs. Peta also looks at walks, on base percentage (“OBP”), slugging percentage (“SLG”) and isolated power (“ISO”). Now we are starting to build a list of variables that drive a team’s run production. Once we know a team’s run production we can begin to calculate their expected win percentage.

Now I know we want to jump right into the algorithm creation phase, but we will get to it. First, I want you to absorb the key part of these two chapters… success in baseball is correlated to a team’s runs scored. A team’s runs scored is correlated to their efficiency at the plate and the resulting hitting. I know it sounds simple and this part of algorithm development is simple. For example, whether a game goes over the total is a result of how much the teams score. I know, rocket science!! However, we first need to know this initial step so we can move on to thinking about the next step… what will dictate a team’s ability to score in this game? It’s at the second layer where we sink our teeth into the meat of an algorithm’s development. Here’s another example… given the success of the horse racing algorithm, I am getting a lot of questions on how to design one. I would ask you, what dictates which horse wins a specific race? Answer: The horse who gets to the finish line in the shortest amount of time. What is the determining factor in how quickly something gets from one point (the gate) to another (the finish line)? Answer: It’s the thing’s speed. So, knowing the core of a horse algorithm is simple. You need to calculate the average expected speed of each horse in the race to figure out probabilities on who will win. Again, it’s easy knowing the final factor which dictates the outcome of the event. Now the question becomes… how do we calculate it (the final factor) with a high degree of confidence?

In Part III of this article I will get into the formula for using MLB stats to predict a team’s expected win probability. We will then compare that probability to the sportsbook’s implied probability based on the lines they are hanging. It’s at this point we find our edges in the numbers. If our algorithm is better than the sportsbook’s, we can print money!

For Part III of the article, I would advise you to read Chapters 3 & 4 of the book. I will then have Part III of this article posted for Monday evening. Also, I will say the same thing at the end of each of these parts… if you have questions or feedback please send them over to me. Best way is using the form on the Contact page at the website or sending me an email. Sometimes others have the same question too and it’ll be good to cover.

Part III – The Pythagorean Theorem

When I left you at Part II, I said to read Chapters 3 & 4. In re-reading the chapters myself, they cover a lot of ground. I think for the purposes of Part III of this article, I am just going to cover Chapter 3. Chapter 3 discusses an important concept and I want to give that the necessary coverage and emphasis. So, let’s get into it…

I love the first part of Chapter 3 where Peta discusses analyzing retail trader information (he has access through his work at investment bank UBS). Peta uses the data to time the financial markets and make trades. Essentially, this is what I do with The Sharp Plays. I monitor public and sharp action at two global sports books and use that information in various ways/angles to give myself (and hopefully all of you) an edge. Anyway, I enjoyed this part of the book just because it illustrates another connection between financial market trading and sports market trading. Moving on…

The core of the chapter discusses Bill James and the world of baseball analytics. I won’t get into Bill James a lot for the purposes here, but suffice it to say he is the father of modern analytics, primarily baseball analytics. Bill James created a formula which works across all sports. It’s known as “The Pythagorean Theorem”. The real Pythagorean Theorem is simply a^2+b^2=c^2. You may remember the formula from geometry class. For the Bill James version, you take a team’s runs (points, goals) scored and the same team’s runs (points, goals) allowed. You then use these two bits of data in the following formula:

RS^2 / (RS^2 + RA^2) = Team’s Win Percentage
RS = Runs Scored
RA = Runs Allowed

An easy way to do this formula in EXCEL is to lay it out as follows…

The above formula has been tested for MLB and it has been shown to be more accurate if you change the exponent in the formula from “2” to “1.83”. When Peta ran this formula for the MLB season in the book, he found that only six MLB teams had final results which were off by more than three wins. Given this analysis, we can now surmise that if we knew in advance a team’s runs/points/goals scored & allowed, we could accurately compute their win percentage on the season.

While the book is about MLB, the formula can be used across all sports. The exponent changes by sport, but the base formula is still the same. Yes, we can cover the exponents for other sports later on, including the 1.83 for MLB. For now and the purposes of this article, whenever I use the Bill James formula I will use “2” for the exponent regardless of the sport I am discussing. Using “2” just keeps things standardized for the article and avoids any confusion in these early stages.
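If you prefer code to a spreadsheet, the same formula can be sketched in a few lines of Python. The function name `pythagorean_win_pct` and the 800 / 700 run totals are hypothetical, chosen only to show the mechanics. The exponent defaults to 2 as used in this article, with 1.83 passed in for the MLB-tuned version.

```python
def pythagorean_win_pct(scored: float, allowed: float, exponent: float = 2.0) -> float:
    """Bill James formula: RS^x / (RS^x + RA^x)."""
    if scored < 0 or allowed < 0 or (scored == 0 and allowed == 0):
        raise ValueError("scored and allowed must be non-negative and not both zero")
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


# Hypothetical team: 800 runs scored, 700 runs allowed over 162 games.
RUNS_SCORED, RUNS_ALLOWED, GAMES = 800, 700, 162

pct_standard = pythagorean_win_pct(RUNS_SCORED, RUNS_ALLOWED)     # exponent 2
pct_mlb = pythagorean_win_pct(RUNS_SCORED, RUNS_ALLOWED, 1.83)    # exponent 1.83

print(f"exponent 2.00: {pct_standard:.3f} -> {pct_standard * GAMES:.1f} expected wins")
print(f"exponent 1.83: {pct_mlb:.3f} -> {pct_mlb * GAMES:.1f} expected wins")
```

Notice that the lower exponent pulls the result slightly toward .500, so the choice of exponent matters most for teams with very lopsided run totals.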

For Part IV of the article, let’s move on to Chapter 4 which discusses projecting player performance. Now, the book has 25 chapters. Don’t worry, the majority of the book is how Peta performed with his model and some tweaks along the way. So, basically, once we get past Chapter 5, we will start to cover 3-5 chapters per “Part” of this article. However, these early stages are the core of building a model and so I want to move slowly and step by step. I also want to allow you time to mess around with key formulas on your own, like the one above, before we move to more formulas. HOMEWORK: To prepare for Part IV on Thursday, not only read Chapter 4, but try the “Bill James Pythagorean Theorem” for yourself. Let’s look at it for another sport in the process as well. Let’s try football! The big off-season move in the NFL is Tom Brady going to the Bucs. Go out and find the Points For and Points Against for the Bucs last season (2019-20). Then… (1) when you put the data for the Bucs into the above formula, ask yourself did the Bucs over-perform or under-perform? You would do this simply by comparing the expected win percentage with their actual win percentage. (2) With Tom Brady at the helm, how would you adjust the Bucs Points For and Points Against from last season to predict the upcoming season’s performance? I will do this calculation myself and share my findings in Part IV. Hope you are having fun with this! As I said, this is a slow process and it does take time and effort to create a model. If creating a model were easy, as the saying goes, everyone would do it and make money off of it. What differentiates model makers is how badly they want it and how hard they work for their model!

DATA: Data can be found on most league websites or a website like TeamRankings.com. One of my favorites for MLB is Baseball-Reference.com and last season’s stats can be found at https://www.baseball-reference.com/leagues/MLB/2019.shtml.

Part IV – What Makes A Model Tick

Look at you, moving right along in building an MLB model! A model which could apply to other sports as well. Before moving on, let’s quickly recap what we have covered so far…

1) Hits in baseball convert to runs at a roughly 2:1 ratio (2 hits per run).

2) Runs scored and runs allowed have a direct correlation to win percentage.

Now we need to calculate how many hits and thereby runs a team is projected to have in a given season. We also need to know how many runs a team is expected to give up. How can we run both of those assessments? We have to take things down to the player level. It’s at this point where we dive into Chapter 4 of “Trading Bases” titled “Player Performance Projected” and our homework from Part III. It is also at this point where I expect to lose 80% or more of my readers. Why would I lose so many people? Because, this is the point where the actual work begins and where I can no longer take you by the hand. It’s at this point you have to make decisions and engage in critical thinking on your own. For some, this will be the best part of the process. For others, it’ll be the point where they say thanks, but no thanks. Just remember, if making a profitable betting model were easy, straightforward and everyone used the same recipe… what good would the model be? It’s the uniqueness of the model which is how you find your edge. It’s why I also never share even the slightest portion of my recipes. If 200 people had my recipe and began attacking the markets using the recipe, the value would be long gone and the recipe would be useless for 98% of the people. It’s at that point I would have to devise a new model to take advantage of the pricing inequalities now created by the first model and find new angles. LOL! Is your head spinning yet?? Hopefully not, let’s move on…

In Chapter 4 of Trading Bases, Joe Peta discusses how he put together player projections to calculate how many runs a team would score and how many runs they would give up. For the purposes of this article, I won’t recap what Peta does. You can simply read that in Chapter 4. Instead, I will get into making your own assessments. Now, the article here so far relates to MLB and thereby moneylines. However, I will get into spreads and totals in later parts of this article. So, those interested only in spreads, hang in there!

OK, you are now at the point where you need to assess things at the player level. You can do this by taking the data on players from the previous year and using it for the year ahead. The problem however is players age, get injured, move to new teams, styles change, etc. All of those things have an effect on the players and of course on the offensive and defensive performances of a team. So, whether you assess players based on Peta’s method (discussed in Chapter 4) or some method of your own is up to you. In my opinion, the best tool to assess player changes year over year is PECOTA. It stands for Player Empirical Comparison and Optimization Test Algorithm. It is used by MLB front offices all over the league and is an excellent tool to assess how a player will evolve year over year. You can check the data out at https://www.baseballprospectus.com/pecota-projections/. Subscription prices to access PECOTA are very reasonable and the data can be used, not just in your model design, but also for fantasy baseball too.

Bottom line, for your algorithm to be based on a current season assessment, you first need to update each team’s roster for player moves (signings, trades, retirements). Once you have updated each team’s roster with the current season’s players and their previous season’s stats, now you have to adjust those stats for a new season. We now have to make assessments on each player’s performance expectation for the season ahead. At this point PECOTA comes in very handy to assess players for the upcoming season. You do not need to do individual assessments for bullpen pitchers. For bullpens, Joe Peta found in his studies that using the WAR (Wins Above Replacement) average for bullpens from the previous season is sufficient. Bullpen performance has a lot of randomness to it. So, you can use the average bullpen runs allowed from the previous season as your standard for all teams… then tweak slightly for exceptionally good and bad bullpens. To compute runs scored, you need to analyze every offensive player for hitting and quality of hitting. For runs allowed, you will assess what you expect the starting pitchers to give up combined with what you expect the bullpens to give up. Using the previous season’s stats as the base helps to dramatically minimize the work. Instead of going from the ground up on each player, you are merely tweaking their previous year’s performance. Once you cover offensive production, starting pitcher performance and have your bullpen runs allowed, you are good to go! Now you have the necessary information to plug into the Bill James Pythagorean Theorem (your estimated runs scored & allowed).

Also, just a quick side note related to MLB correlations… Peta notes that every 10 runs a team scores in a season equates to 1 win.

ALGORITHM LAB:  I had you do some homework so we could perform a mini-assessment for this part of the article using what we have learned up to this point. To prepare for Part IV, I had you pull some stats on the Tampa Bay Bucs and make some calculations using the Pythagorean Theorem. I also asked you to do an assessment as to what you think Tom Brady will do to the stats (Points For & Against) for the Tampa Bay Bucs this season. My assessment is below…

OK, in the second row of the table, I list Tampa’s statistics from last season. Tampa scored 458 points and allowed 449 points. When we input the data into the Bill James formula from Part III, we calculate an expected win percentage of .509922. A roughly 50% win percentage would have resulted in a Bucs record of 8-8. The Bucs finished the season 7-9, meaning they underperformed based on their statistics. Funny thing is many of my followers will remember a Premium Play on Tampa -1.5 over Atlanta on December 29th. Why would this game stand out? The game opened with a Winston INT that resulted in points for Atlanta and another turnover which led to an Atlanta field goal. After the 1st Quarter we found our Tampa -1.5 bet down 10-0! In the 2nd Quarter however, Tampa came roaring back. The Bucs put up 22 points and went into the half with a 22-16 lead. In the second half and with this being the final game of the season for both teams, it looked like both teams were mailing it in. The problem is the Falcons ended up getting 6 points in the 4th quarter (one FG was with no time left) and the game went to OT. In OT, Tampa got the ball first. Perfect! Let’s just get Tampa down the field and into the end zone! Well, first play of OT, a Winston Pick 6! Game over… Atlanta wins 28-22. UGH!!!!!! My point of this long story is Winston had another one of his Pick 6 mistakes and it cost Tampa an 8-8 record. Granted, in this game the Tampa kicker also missed three field goals. However, the point remains that Winston both cost his team points last season on offense and gave up points to the other team. So, if everything else stayed the same for the Bucs, what will the removal of Winston and the introduction of Brady do for Tampa’s numbers?
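Here is a minimal Python sketch of the Bucs calculation, using the 458 Points For and 449 Points Against quoted above with the exponent of 2 used throughout this article. The what-if adjustments at the bottom (`PF_ADJUSTMENT`, `PA_ADJUSTMENT`) are hypothetical placeholders and not my projection; swap in whatever you believe the quarterback change is worth.

```python
def pythagorean_win_pct(points_for: float, points_against: float, exponent: float = 2.0) -> float:
    """Bill James formula: PF^x / (PF^x + PA^x)."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


GAMES = 16
BUCS_PF, BUCS_PA, BUCS_ACTUAL_WINS = 458, 449, 7

expected_pct = pythagorean_win_pct(BUCS_PF, BUCS_PA)
expected_wins = expected_pct * GAMES
wins_vs_expectation = BUCS_ACTUAL_WINS - expected_wins  # negative = underperformed

print(f"expected win pct: {expected_pct:.6f}")
print(f"expected wins:    {expected_wins:.2f}")
print(f"actual - expected: {wins_vs_expectation:+.2f}")

# HYPOTHETICAL what-if: placeholder adjustments for a quarterback change.
PF_ADJUSTMENT = +20   # assumed extra points scored over the season
PA_ADJUSTMENT = -30   # assumed fewer points allowed (fewer turnovers returned for scores)

what_if_pct = pythagorean_win_pct(BUCS_PF + PF_ADJUSTMENT, BUCS_PA + PA_ADJUSTMENT)
print(f"what-if win pct:  {what_if_pct:.3f} -> {what_if_pct * GAMES:.1f} wins")
```

The first part reproduces the .509922 figure, which works out to a little over 8 expected wins against the 7 actual wins.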

I didn’t mean to confuse you by throwing football into this discussion. I did it just to show you the versatility of the Bill James Pythagorean Theorem and to use an example we can all understand.

I could and would have to write a book to properly cover projecting player performances. Only because there are so many ways to do it. However, that’s the good thing. It means there is no set way. How you develop your assessments is what makes your model unique. If you find a way to assess players that correlates properly to outcomes, you will have created a mathematical money machine! It’s going to take effort, time and working through failure. “Nothing in the world is worth having or worth doing unless it means effort, pain, difficulty… I have never in my life envied a human being who led an easy life.” – T. Roosevelt. However, when you create that model and realize you cracked a little piece of the code, it makes it all worthwhile and will start you on your model building journey. There is still a lot more ground to cover in this article. Part IV just brings us to the top of the mountain. You now have much of what you need to create a model. Now we have to navigate down the other side on how to use the model for individual games, how to tweak it and how to expand to other sports.

For Part V of the article, I would advise reading Chapters 5, 6 and 7 in “Trading Bases” for Tuesday. I will have Part V on Tuesday, instead of Monday, to give you plenty of time to mess around with the above and read the three chapters in the book. We are getting ready to round the turn into the stretch run. We are almost there! See you back here on Tuesday night!

Part V – Individual Game Analysis

OK, we reached the summit of “Designing An Algorithm”. Now we trek down the other side of the mountain with our algorithm in hand. The algorithm created at this point is set up to analyze futures betting and the big picture for an MLB season. How can we now zoom in and use our season assessment of teams on a daily basis? If you really want to have a functional MLB model to assess individual games (the same theories can be carried to other sports by the way) you need to break down your data for each individual game. In designing your MLB model, I would advise having different worksheets in Excel for each team. So, you would have a Yankees worksheet, Red Sox worksheet etc. On each of these worksheets would be the team roster, pitching staff and your bullpen assessment… laid out in the format and with the data you used to calculate your full season assessments. Doing so will allow you to make easy adjustments immediately for injuries, trades, demotions, etc. throughout the season. Also, on each of the team’s worksheets, it would help to have a “Today’s Lineup” section whereby you could input the day’s batting lineup and the starting pitcher. When you set up the day’s lineup and starting pitcher, you would again use the predicted data that you calculated for each player.

At this point, using the “Today’s Lineup” section that you created, you could analyze the win probability for the team for that day, based on the lineup. You would now be able to calculate that lineup’s expected hits and thereby their expected runs scored (using the 2 hits to 1 run ratio). You would also know the expected runs allowed for the game based on the starting pitcher and your previously calculated bullpen assessment for that team. So, you have all the data you need on the team to get their Bill James Pythagorean Theorem estimated win percentage based on today’s lineup. Yes, you are using the theorem on just one game’s data, but it works exactly the same as though you were assessing the team based on the full season data. You are basically analyzing the resulting runs scored and allowed if the team played with the exact same lineup for the whole 162 game schedule. Joe Peta gets into a further explanation on this topic in Chapter 5. To avoid writing my own book, my goal is that you use Peta’s information from the book as the source for the intricate details of these topics, with my article here to fill in some blanks or expand on ideas. When you have completed your full assessment of a single team using their lineup for the day, you move on to the next team and every team playing that day. Yes, it’s a lot of work. Don’t panic, I have a shortcut coming up.
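As a rough picture of what a “Today’s Lineup” section computes, here is a minimal Python sketch. Every number in it is a hypothetical placeholder (the per-game hit projections, the starter and bullpen runs allowed), and the names `Hitter`, `expected_runs_scored` and `expected_runs_allowed` are my own illustration. The only pieces taken from the article are the 2 hits to 1 run ratio and the Bill James formula with an exponent of 2.

```python
from dataclasses import dataclass

HITS_PER_RUN = 2.0  # the "golden rule" ratio used in this article


@dataclass
class Hitter:
    name: str
    expected_hits_per_game: float  # your own projection for this player


def expected_runs_scored(lineup: list[Hitter]) -> float:
    """Sum the lineup's projected hits and convert to runs at 2 hits per run."""
    return sum(h.expected_hits_per_game for h in lineup) / HITS_PER_RUN


def expected_runs_allowed(starter_runs: float, bullpen_runs: float) -> float:
    """Starter's projected runs allowed plus the team bullpen assessment."""
    return starter_runs + bullpen_runs


def pythagorean_win_pct(scored: float, allowed: float, exponent: float = 2.0) -> float:
    s, a = scored ** exponent, allowed ** exponent
    return s / (s + a)


# HYPOTHETICAL projections for a nine-man batting order.
projected_hits = [1.2, 1.1, 1.1, 1.0, 1.0, 0.9, 0.9, 0.8, 0.8]
todays_lineup = [Hitter(f"Batter {i}", hits) for i, hits in enumerate(projected_hits, start=1)]

runs_scored = expected_runs_scored(todays_lineup)
runs_allowed = expected_runs_allowed(starter_runs=2.6, bullpen_runs=1.5)  # placeholders
lineup_win_pct = pythagorean_win_pct(runs_scored, runs_allowed)

print(f"expected runs scored:  {runs_scored:.2f}")
print(f"expected runs allowed: {runs_allowed:.2f}")
print(f"lineup win pct:        {lineup_win_pct:.3f}")
```

Swapping one hitter or the starting pitcher only means changing one input, which is the same convenience the per-team worksheets are meant to give you.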

In Chapter 5, Peta has a breakdown of calculations on how to assess individual game probabilities based on your calculated team win probabilities. I will use Peta’s example of a calculated .600 winning team playing a calculated .400 winning team. Remember, before you calculate the win probabilities for the game, be sure to boost the home team’s win percentage. Peta uses an 8% boost to the home team. The edge can be calculated simply by taking the home team’s calculated win percentage as a decimal and multiplying it by 1.08. The result is the home team’s win percentage with an 8% boost. OK, so let’s break down how we get to each team’s predicted win percentage in an individual game. Peta uses the following example…

Now we take those calculations and enter them into another formula to get the team’s projected win percentage for this game…
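For readers who want to check the arithmetic in code, here is a minimal Python sketch. It assumes the head-to-head step is the log5 style calculation (each team’s win percentage times the opponent’s loss percentage, then normalized). That is my assumption rather than a quote from the book, although with no home boost applied it does give 69.2% for a .600 team against a .400 team. The function name and the optional `home_boost` argument are illustrative.

```python
def head_to_head(home_pct: float, away_pct: float, home_boost: float = 1.0) -> tuple[float, float]:
    """Return (home win probability, away win probability) for a single game.

    home_boost multiplies the home team's win percentage first
    (for example 1.08 for an 8% boost); 1.0 means no boost.
    """
    home = min(home_pct * home_boost, 1.0)
    home_share = home * (1.0 - away_pct)   # home wins and away loses
    away_share = away_pct * (1.0 - home)   # away wins and home loses
    total = home_share + away_share
    if total == 0:
        raise ValueError("win percentages leave no possible outcome")
    return home_share / total, away_share / total


favorite, underdog = head_to_head(0.600, 0.400)
print(f"no boost:  favorite {favorite:.1%}, underdog {underdog:.1%}")

boosted_favorite, boosted_underdog = head_to_head(0.600, 0.400, home_boost=1.08)
print(f"8% boost:  favorite {boosted_favorite:.1%}, underdog {boosted_underdog:.1%}")
```

Because the two shares are divided by their own sum, the two probabilities always add up to 100%, which makes a handy sanity check.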

The numbers should total 100%. If not, you did something wrong. Again, a full and thorough explanation of the above process can be found in Chapter 5 of “Trading Bases” using these calculations. If you haven’t gotten the book, I believe it is \$12.99 on Kindle and well worth the investment. It will fill many of the gaps in the article. OK, so now we know the favorite’s probability of winning the game is 69.2%. What does this mean for the moneyline? We now have to calculate this out with another formula. All these formulas can be programmed into an Excel spreadsheet so that when you enter the individual team’s calculated win percentage it automatically adds the 8% boost and calculates everything right down to the moneyline. At least a working knowledge of Excel is vital to having a working model here. If you are not familiar, I would strongly advise the “Excel For Dummies” book. The “For Dummies” series is excellent despite the silly name. So, back to calculating the moneyline. Here’s how to calculate the moneyline based on your calculated implied win probability.

The formula to calculate a favorite’s moneyline from its implied probability (entered as a percentage) is…
-(Implied probability / (100 – Implied probability)) x 100 = Moneyline
A team that wins 69.2% looks like this for the moneyline calculation… -(69.2 / (100 – 69.2)) x 100 = -224.7, which rounds to -225
What that means is this team is a value at any price cheaper than -225 (for example -200 or -180).

So, if the team we calculate as the favorite in this match is -180 at the sportsbook, we have a value and we would want to bet on that team based on our calculations. There you have it! You have now taken the team’s predicted performance based on our assessment and calculated what the moneyline should be. We compare that to what the bookmaker is hanging and if the lines differ, we play the value.
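The conversion in both directions can be sketched in Python as follows. The favorite branch is the formula given above. The underdog branch and the reverse conversion (`moneyline_to_probability`) are the standard American odds conversions, which I have added for completeness. Note the reverse conversion gives the raw implied probability of the posted price, with the book’s margin still included.

```python
def probability_to_moneyline(prob_pct: float) -> int:
    """Convert a win probability (in percent) to a fair American moneyline."""
    if not 0 < prob_pct < 100:
        raise ValueError("probability must be between 0 and 100 (exclusive)")
    if prob_pct >= 50:
        # Favorite: -(p / (100 - p)) x 100
        return round(-(prob_pct / (100 - prob_pct)) * 100)
    # Underdog: ((100 - p) / p) x 100
    return round(((100 - prob_pct) / prob_pct) * 100)


def moneyline_to_probability(moneyline: int) -> float:
    """Convert an American moneyline to its raw implied probability (in percent)."""
    if moneyline < 0:
        return -moneyline / (-moneyline + 100) * 100
    return 100 / (moneyline + 100) * 100


model_prob = 69.2   # our calculated win probability for the favorite
book_line = -180    # what the sportsbook is hanging

fair_line = probability_to_moneyline(model_prob)        # -225
book_prob = moneyline_to_probability(book_line)         # about 64.3
edge_points = model_prob - book_prob                    # about 4.9 percentage points

print(f"our fair line:         {fair_line}")
print(f"book implied prob:     {book_prob:.1f}%")
print(f"edge (percent points): {edge_points:+.1f}")
```

A positive edge means the book is asking you to win less often than your model says you will, which is the definition of value used in this article.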

What about that shortcut? Yes, it is a lot of work to adjust the lineup for each team, each day and then use that to calculate the line for every game. Granted it is the proper way to do things, BUT there is a shortcut.

The first shortcut is to have software designed for you and a data feed that does all the heavy lifting in minutes. However, the expense of this doesn’t exactly make it feasible for most everyday bettors. So, instead I would advise you to do what Peta does. First, go through and use your full season projections for each team to calculate the expected moneyline for each game. Again, if you have your spreadsheet set up, this calculation can be done in minutes using only Excel. Using the full season win percentages you previously calculated doesn’t require you to make team lineup adjustments and it can be done rather quickly. When you do this, you will narrow a ten game schedule down to 1-3 games which show a decent edge against the line. Then you will only analyze those 1-3 games further using the “Today’s Lineup” assessment I discuss above. Your workload goes down 70-90% in a given day just by running the day’s games using season projections first and then doing the “Today’s Lineup” projections only where needed. Much easier for the everyday bettor.
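The screening pass can be sketched in Python like this. Everything in the `slate` list is hypothetical (team names, season win percentages, posted lines), and the 3 point `MIN_EDGE` cutoff is an arbitrary placeholder for whatever you consider a decent edge. For brevity the sketch only checks the home side, skips the home boost, and reuses the assumed log5 style head-to-head calculation.

```python
def home_win_probability(home_pct: float, away_pct: float) -> float:
    """Assumed log5 style head-to-head probability for the home team."""
    home_share = home_pct * (1.0 - away_pct)
    away_share = away_pct * (1.0 - home_pct)
    return home_share / (home_share + away_share)


def implied_probability(moneyline: int) -> float:
    """Raw implied probability (0-1) of an American moneyline."""
    if moneyline < 0:
        return -moneyline / (-moneyline + 100)
    return 100 / (moneyline + 100)


MIN_EDGE = 0.03  # HYPOTHETICAL cutoff: 3 percentage points

# HYPOTHETICAL slate: (home, home season pct, away, away season pct, home moneyline)
slate = [
    ("Team A", 0.600, "Team B", 0.400, -180),
    ("Team C", 0.520, "Team D", 0.500, -150),
    ("Team E", 0.450, "Team F", 0.550, +140),
]


def screen(games, min_edge=MIN_EDGE):
    """Return only the games whose home side clears the edge cutoff."""
    shortlist = []
    for home, home_pct, away, away_pct, home_line in games:
        edge = home_win_probability(home_pct, away_pct) - implied_probability(home_line)
        if edge >= min_edge:
            shortlist.append((home, away, round(edge, 3)))
    return shortlist


shortlist = screen(slate)
for home, away, edge in shortlist:
    print(f"{home} vs {away}: edge {edge:+.1%} -> run the Today's Lineup analysis")
```

Only the games that survive this pass get the full lineup treatment, which is where the time savings come from.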

You have now added individual game assessment capability to your full season algorithm. Congratulations! You have a true MLB betting model!

Some takeaways from this section of “Trading Bases”… I love in this section of the book how Joe Peta explains the low margins in MLB. People do not realize how advantageous MLB is for the bettor and how finding a dime line makes all the difference. The lower the edge the book has against you, the easier it is to beat the book! It is a great part of the book that often gets glossed over, the value of betting MLB versus other sports!

There is also a VERY IMPORTANT point in Chapter 7 that I LOVED! It’s where Peta discusses how for one of his game assessments, he didn’t think Baltimore would win because he calculated their chances of winning at less than 50%. However, because his calculation showed that Baltimore had a better chance of winning (even though less than 50%) than what the sportsbook was hanging, Baltimore was a VALUE! I wish people would understand this concept more than anything I discuss!!! VALUE is the key to betting. Most sports bettors get this belief that if a team is a value it must be a guaranteed win. No, it’s simple math. If your assessment is more closely correlated to the actual outcome than the bookmaker’s, you will make a fortune betting sports. Sometimes that will mean you bet a team that has a 25% chance of winning. You don’t expect to win that particular bet, but if the oddsmaker has the line calculated like the team has a 5% chance of winning, long term you will make a fortune on those situations. Of course, that assumes your model is better correlated to the actual results than the book’s model!
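To see why losing most of your bets can still be profitable, here is a small simulation sketch in Python using the 25% versus 5% example above. It assumes the book’s price is exactly fair for a 5% chance (19 units of profit per unit staked, with no margin), that your 25% estimate is the true probability, and that every bet is one flat unit. The function name, the 10,000 bet sample and the seed are arbitrary choices.

```python
import random


def simulate_value_bets(true_prob: float, profit_per_unit: float,
                        n_bets: int, seed: int = 0) -> tuple[float, float]:
    """Simulate flat 1-unit bets; return (win rate, total profit in units)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    wins = sum(1 for _ in range(n_bets) if rng.random() < true_prob)
    losses = n_bets - wins
    return wins / n_bets, wins * profit_per_unit - losses


TRUE_PROB = 0.25          # our model's (assumed correct) win probability
BOOK_IMPLIED_PROB = 0.05  # what the book's price implies

# Fair payout for the book's probability: (1 - p) / p units of profit per unit staked.
payout = (1 - BOOK_IMPLIED_PROB) / BOOK_IMPLIED_PROB

win_rate, total_profit = simulate_value_bets(TRUE_PROB, payout, n_bets=10_000)
print(f"win rate:     {win_rate:.1%}")
print(f"total profit: {total_profit:+.0f} units over 10,000 bets")
```

The win rate sits near 25%, so roughly three out of four bets lose, yet the total is strongly positive because each win pays far more than each loss costs.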

Now I know my article leaves a lot of the leg work to you. Listen, if I took you step by step I would literally have to write a book, just like Joe Peta. I wish I had that sort of time. However, Peta already wrote the book. My goal is through the combination of reading his book and this article that you will have all the tools in your hands to make a proper model.

For Part VI of this article, I would advise you to read Chapter 8 through the end of Chapter 11. It is four total chapters, but we are starting to move away from the heavy lifting and now getting into algorithm function and use. Things will begin to move faster at this point. I will have Part VI of this article posted here on Friday afternoon! I hope you enjoy!

Part VI – Risk/Bankroll Management

The most important part of any gambling endeavor is your bankroll management. In finance terms, it is referred to as “Risk Management”. Whatever you want to call it, managing your bankroll or lack thereof will dictate your success or failure. You will hear me continually say that bankroll management is more important than the bets you make. Someone possessing the greatest sports betting model ever created could find themselves bankrupt if they do not practice proper bankroll management and discipline. One of my favorite passages from “Trading Bases” is as follows (hence the highlighted, bold, red text): “It means that no matter what the endeavor, if you have an edge, a competitive advantage or a carefully constructed model with a positive expected return (+EV), you must avoid wiping yourself out with a single bet. Never make a bet on one day that imperils your ability to exist the next day.” Basically, your betting for today should be calculated in a way that win or lose, you will be able to bet the same way tomorrow. If that is not the case, something is wrong and you need to reassess immediately!

I knew the everyday gambler had severely poor bankroll management. I have worked with dozens of sportsbooks as a consultant in various capacities. The operations I worked with were built due to the chasing and pressing of bets by gamblers. However, I didn’t realize the complete lack of bankroll control until my interactions on Twitter. Seeing people talk about being busted after losing 4-5 units on a cold run (which isn’t really that cold) or a couple units on a bad day, made me realize that some folks are lost causes. Those folks will spend the next 10, 20, 30+ years just donating to the bottom line of sportsbooks and casinos. I don’t want to see it happen to these folks, but unfortunately it’s inevitable. Sportsbooks are cash cows for one reason, people suck at managing their emotions and their bankroll. For this reason the sportsbooks devour the average bettor’s finances every year. One simple fix could dramatically minimize the damage the sportsbook does to you. Even if you still suck as a handicapper, adding the fix of proper bankroll management will limit the damage the sportsbook does to you each year.

Why don’t people take part in this simple fix? Some people are betting because they need money. So, even though they may have \$1,000 or \$5,000 to their name, they will bet \$200-\$500 a game because they have to make the mortgage payment or they need income. Sports betting or gambling in general seems like the “easy” way to achieve these goals. In reality gambling to cover money you don’t have is a recipe for disaster. Other bettors find bankroll management takes the fun and excitement out of betting. In that way, they are correct, that is what bankroll/risk management is intended to do. The goal of bankroll management is to make betting more a business than a leisure activity.

I hate to bring up a tough topic, but we learn best through our failures. On November 24th, 2019 I released a Robin Hood Selection. The play lost. What followed was the worst draw down of any wagering activity I engaged in for 2019. The Robin Hood Selection and the subsequent Robin Hood Club lost 19 units. It sucked! However, such a draw down was not out of the statistical realm. Not based on my opinion, but based on statistical analysis, a professional bettor is expected to have at least one draw down of up to 25+ units in a given year. In October, before any of the Robin Hood Selection issues happened, I posted an article by Pinnacle on the topic. You can read the article at https://TheSharpPlays.com/sports-betting-drawdowns-by-pinnacle-sports/. The point of the article is even professional sports bettors, those betting with an edge against the book, can expect to experience at least one substantial draw down annually. At the end of the year a professional bettor will still finish plus money overall, as I did, despite the 19 unit draw down. HOWEVER, the road to profitability each year won’t just be a straight line up. Similar to the price movement of a stock, the bankroll of a professional bettor will be a series of peaks and valleys along an otherwise upward trajectory. Apple stock doesn’t go from \$10 to \$200. Apple stock goes from \$10 to \$30 back to \$20, then up to \$50, up to \$80 and then down to \$40. Over that example, Apple was a great investment, moving from \$10 to \$40. What you don’t realize is along the way, Apple lost 50% of its value from its highest point (\$80 back to \$40). Sports betting profits for professional bettors follows a similar path as the Apple stock example. If you are only prepared for the peaks and not the valleys, prepare to drown when the next valley hits.
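The peaks and valleys described above can be measured directly. The sketch below computes the largest peak-to-valley drop for the Apple price path from the example, plus the same idea in units for a bankroll. The function names and the bankroll history are hypothetical, made up purely for illustration.

```python
def max_drawdown_fraction(values):
    """Largest drop from a running peak, as a fraction of that peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                      # highest point seen so far
        worst = max(worst, (peak - v) / peak)    # how far below that peak we are
    return worst


def max_drawdown_units(bankroll_history):
    """Largest drop from a running peak, measured in units."""
    peak = bankroll_history[0]
    worst = 0.0
    for b in bankroll_history:
        peak = max(peak, b)
        worst = max(worst, peak - b)
    return worst


# The Apple path from the example above: $10 -> $30 -> $20 -> $50 -> $80 -> $40.
apple_path = [10, 30, 20, 50, 80, 40]
print(max_drawdown_fraction(apple_path))   # the $80 -> $40 slide is the worst stretch

# A hypothetical bankroll history in units (not real results).
bankroll = [100, 108, 104, 112, 93, 101]
print(max_drawdown_units(bankroll))
```

Tracking this number alongside your profit shows whether the valleys you are living through are in line with what your bankroll was sized to survive.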

Hopefully the above has provided the emphasis as to why bankroll management is important. So, what’s the best way to manage bankroll? There are all sorts of theories out there from the standard 100 unit bankroll, to the Kelly Criterion to what Joe Peta uses. For the purposes of this article I will discuss what I do and why. However, I strongly advise you to research bankroll management strategies and find the one that best works for you!

I use the old school 100 unit bankroll with a 1 unit maximum bet. It’s simple and it is easy. Are there more effective strategies, yes, but this one works for me! My method ensures that if the 25 unit draw down hits me immediately as I start the year, before I even have a chance to accumulate house money, I will still have 75 units in reserve to keep attacking. Do I adjust my bankroll? I used to adjust my bankroll as it grew, but now I am happy where I am in bankroll terms. However, when I was looking to achieve a certain base unit wager, I would adjust my bankroll every quarter. At the end of the quarter, if I was over 110 units in total bankroll on hand, I would take the profit above 110 units. If I was under 110 units, I leave it alone. Why leave 110 units in there and not 100? I liked keeping the 10 units of extra reserve in my bankroll when I was up money. Obviously, if I have over 110 units, it means I had a good run. It also means, based on standard probabilities, I will be due for a cold run. I know eventually a cold streak will hit and that 10 units will help me weather the storm and maintain my bankroll goals. The 10 units will allow me to stay close to 100 units total when the cold streak hits. Remember there will ALWAYS be cold streaks!!!!! I don’t care how good you are! Now that I am not growing my bankroll, what do I do? At the end of the year I remove the profit above 100 units and start over with 100 units for the new year. Rinse and repeat, year after year.
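The quarterly and year-end rules above boil down to one line of arithmetic each. Here is a small sketch of them, with function names that are my own invention rather than anything from the book.

```python
def quarterly_withdrawal(bankroll_units: float, keep_units: float = 110.0) -> float:
    """Profit to pull out at quarter end: only what sits above the 110-unit cushion."""
    return max(0.0, bankroll_units - keep_units)


def year_end_withdrawal(bankroll_units: float, base_units: float = 100.0) -> float:
    """Profit to pull out at year end before restarting at 100 units."""
    return max(0.0, bankroll_units - base_units)


def units_left_after_drawdown(start_units: float = 100.0, drawdown_units: float = 25.0) -> float:
    """What remains if the expected cold run hits before any profit is banked."""
    return start_units - drawdown_units


print(quarterly_withdrawal(118.0))     # above the cushion, so the excess comes out
print(quarterly_withdrawal(104.0))     # under 110, so nothing is touched
print(year_end_withdrawal(127.5))      # reset to 100 for the new year
print(units_left_after_drawdown())     # the reserve left after a 25-unit drawdown
```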

Do I keep my entire bankroll in sports betting accounts? Of course not. It’s too risky. I have a special bank account which holds my reserve. I usually have 30 units on hand with sportsbooks and the other 70 units in reserve with a bank. When my combined sportsbook balance gets above 40 units, I withdraw the funds into the account which holds my 70 units of reserve. Pretty simple in principle, right? My friends, that is the entirety of my bankroll management plan. I have a wager range of 0.20 units to 1 unit, 100 unit starting bankroll each year, I take out the profits at the end of the year, maintain 70% of my bankroll with a bank and when I was growing my bankroll I would make adjustments on a quarterly basis. It’s not flashy, but it works and it will make sure you never go bust… because if you get to -50 units, it’s time to hang it up. Clearly betting is more a leisure activity than a professional activity.

In the section of the book I had you read, Chapters 8 through 11, Peta also discusses ways you can analyze the quality of your model by breaking down what bets you are winning and what bets you are losing. I will not reiterate that here, Peta does a fine and clear job of it. So, I will leave that to him to describe to you in the book. However, I will say it is always good to analyze the results of your model to see what you are winning, what you are losing and how you are winning or losing. It allows you to often filter your model down to its sweet spot (i.e. is your model best at picking dogs, favorites, home teams, etc.).

Concluding the topic of bankroll/risk management, I cannot say it enough, but I will say it one more time… managing your bankroll is more important than the bets you make! It is the difference between a winning gambler and everyone else!

Part VII of this article will be posted on Tuesday, April 14th. I would advise you to read Chapters 12, 13, 14, 15 and 16 for Tuesday. I know, sounds like a lot, but his chapters are not that long. Plus the heavy lifting with regard to the complex concepts is now over. So, it should not be too painful. I hope, for those where it applies, you have a very Happy Easter or a Happy Passover!

Part VII – Tweaking the Model

With regard to our MLB model, the primary method of making adjustments to your model is through editing of those team roster sheets you created (discussed previously in this article). The roster sheets are the individual team pages containing the team’s roster with your predicted stats for each player. By taking the time to create those sheets before the season, you can now easily make adjustments or even expand into other statistics. I can’t stress enough that all this takes time. Back 20 years ago, few bettors were doing this so it was easy to find ways to gain edges. Now, thanks to technology, bettors have sharpened their games. However, so have the books. It’s a constant cat and mouse game. Your first season will require the most work in getting your model going. Once you have a profitable model, typically the workload for future seasons is 10-15% of what you spent in the creation/season 1 phase. So, don’t worry, it does get better!

A side note for this section, if you are looking for the exponents to use for other sports (i.e. how we use 1.83 for the Bill James Pythagorean Theorem for MLB) you can visit https://en.wikipedia.org/wiki/Pythagorean_expectation. I will discuss other sports later in this article.

I would also like to add another side note. On the last page of Chapter 12, Peta talks about one of the most important things in the whole book. Peta discusses Warren Buffett’s saying that “they don’t ring a bell at the top”. Some of you reading this are young guys in your 20s. While I hardly think I am old, I am definitely older than you. Regardless of your age, make a point to cherish every moment. It’s not easy, it takes effort, but when you look back you want to minimize the regrets you have. We will all have regrets, but this section of the book was especially poignant for me. I am avoiding writing a book report about Peta’s book, rather using it more like a textbook for this article. However, I felt this is just one of the important lessons in the book. It goes well beyond sports. I talk about my aunt all the time. She was a major presence in my life. She passed away a few years ago. As you know, my wife and I do a scholarship in her memory. Coincidentally, right now we are actually going through the applications for this year’s applicants. Anyway, my aunt texted me one day asking if I could drive her to a routine medical appointment and just generally checking in as she always did. I told her I would take her to the appointment and we chatted briefly about the news of the day. It was the last text message I ever got from her. There was no “ringing of a bell” to let me know this was the last time. She was in good health, so there was no inkling I would not have the opportunity to text with her again. It is just how life works… you never know what the future holds. I wish there was a bell. You never know and it isn’t easy, but always try to end conversations or enjoy those special moments as though you may never get to experience them again or with those same people. Be it a family trip to a favorite place, a conversation with a loved one, whatever. Don’t take for granted that you will do it right “next time” because you may not get that opportunity.
Do it right the first time and work to do it right every time!

Side note, Interleague play can mess with models due to the rule changes for the teams. Peta discusses some techniques as to how he addressed these issues and I think those adjustments are spot on. I will refer you to the book and his discussions of how he adjusted his roster analysis for Interleague play.

For Part VIII, which I will post on Friday (April 17), I would advise reading Chapters 17, 18, 19, 20, 21. It’s five chapters, but most of them are brief so it should not be too heavy of lifting. Read two chapters a night and you will be good! We are slowly coming to the end of the book and thereby the article. I expect there to be about 10-12 parts to this article. Upcoming discussions include applying the model to other sports (including NFL power ratings), analyzing your model and a question and answer segment. Hope you have a good few days and see you back here on Friday!

Part VIII – Analyzing Your Model

So we have created our model, we have setup a bankroll management strategy, we have discussed tweaking the model along the way (which I will discuss further here too)… now what? Now it is time to analyze the performance of our algorithm. To me, this is one of the best parts of model design. You are taking the actual results and assessing where your model is good and where it is bad. Obviously, we want to maximize the good and minimize the bad, but how do we do it?

First, proper DETAILED record keeping is essential for dissecting all the minutiae of your model. I would suggest creating a new spreadsheet where you will log all the wagers your model suggests. I would suggest starting with the following columns for each bet:

1) Date
2) Team You Bet
3) Opening Line
4) Closing Line
5) Was Your Bet on the Home or Away Team
6) Is the Team American or National League (even break down by division too)
7) Starting Pitcher
8) Day or Night Game
9) Result of the Bet
10) Score of the Game and the Total (with the total you can see if there are any correlations to OVER/UNDER in your bets on certain teams or pitchers)
11) First Five Innings Result and Score (It will allow you to analyze the 1st 5 Innings results for sides and totals to see about correlation there. Sometimes a model may be much better for the 1st Half than the full game or vice versa. Knowing this could prove valuable)
12) Notes (anything worth noting on the outcome of the game)
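If you would rather keep the log in code than in a spreadsheet, the columns above map onto a simple record. The sketch below is one possible layout. The field names, the `win_rate_by` helper and the three sample rows are all hypothetical placeholders, not real bets.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BetRecord:
    date: str
    team: str
    opening_line: int        # American odds when the line opened
    closing_line: int        # American odds at close
    home_or_away: str        # "home" or "away"
    league: str              # "AL" or "NL" (add division if you like)
    starting_pitcher: str
    day_or_night: str        # "day" or "night"
    result: str              # "W" or "L"
    final_score: str
    game_total: float        # the posted total for the game
    first_five_result: str   # first five innings result and score
    notes: str = ""


def win_rate_by(records, field):
    """Win rate of the logged bets, grouped by any one column."""
    tally = defaultdict(lambda: [0, 0])           # key -> [wins, bets]
    for record in records:
        key = getattr(record, field)
        tally[key][1] += 1
        if record.result == "W":
            tally[key][0] += 1
    return {key: wins / bets for key, (wins, bets) in tally.items()}


# Placeholder rows only, to show the shape of the log.
log = [
    BetRecord("2020-04-01", "Team A", -120, -130, "home", "AL", "Pitcher X", "night", "W", "5-3", 8.5, "W 2-1"),
    BetRecord("2020-04-02", "Team B", +140, +135, "away", "NL", "Pitcher Y", "day", "W", "4-2", 9.0, "L 0-1"),
    BetRecord("2020-04-03", "Team A", -110, -105, "home", "AL", "Pitcher Z", "night", "L", "1-6", 8.0, "L 0-3"),
]

print(win_rate_by(log, "home_or_away"))
print(win_rate_by(log, "day_or_night"))
```

Grouping by one column at a time is the quickest way to spot where the model's sweet spot is, whether that is home teams, day games or a particular league.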

Side note to the above paragraph is a quote from Peta in the book, “Even when an investable edge exists, the profit stream is rarely smooth; in fact, it’s almost always lumpy. Proper risk management during the lean times allows for the harvesting of gains when they eventually emerge.” Love the quote because it is spot on for anything you do involving sharp or advantage betting!

Also, when Peta looked back and analyzed his model he found different analytical categories that he felt would be helpful to optimizing the data he was analyzing. For example, when it came to pitchers, Peta began to look at using xFIP and SIERA for his model. Your model will typically start by using standard analytics but as you analyze things, you will notice new angles or weights to improve the model. For example, let’s say you make an NFL model. Your analysis shows your model is strong and when you dive into what makes it strong, you seem to notice that rushing stats have a great correlation to the actual results. Perhaps you could look at advanced analytics that are connected for rushing yards and see if those would help to improve your model. There is a lot of trial and error as well as investigation and research that goes in. Remember though, if you put in the time and crack the code, your model will reward you well! You will put in the data each day and basically have a computerized money machine. How cool is that?!?! LOL!

Once again, a model is work, but it can be fun work as you review the performance analysis and you tweak the model’s analytics. Once you have gotten to this point in the model, you have put the heavy work behind you. The process of tweaking and analyzing is easy by comparison. It is something you will do every month and every year as your model rolls. Also, don’t forget to give your model a name. I just use version numbers. It allows me to track the adjustments along the way. So, when you create the model initially, make it version 1.0. When you tweak the model in minor ways it might be version 1.1. Be sure to track those changes so you can know what makes one model version different from the other. If you have a major change to your model then you move it to version 2.0 and so on. Keeping track of the changes will also allow you to roll things back to a previous version if you want or because the betting environment has changed.

For Part IX of the article, I would advise you to read through the end of the book… Chapters 22, 23, 24, 25 and the Epilogue. For Part IX, I will wrap up the book’s coverage of model creation and tie away some final thoughts for the book. For Part X, I will discuss designing models for other sports. It’s a long topic so I won’t be breaking it down as detailed as I did for MLB of course. Instead I will break down some of the major sports with things to think about for your model in that sport. My goal will be to give you the basics so you can begin to brainstorm on your football, basketball, etc. model. Your homework at the conclusion of Part X will be to send in any questions. I will then use Part XI to cover those questions and I expect to conclude the article fully at that time. We are almost there! Hope you have a great weekend!

Part IX – The Conclusion

I hope you enjoyed reading “Trading Bases”. It is a pretty easy read, it’s educational and quite fun. Anytime someone would ask me about a book on model building, “Trading Bases” by Joe Peta is the one I would always suggest. Hopefully, you now see why. While we have come to the conclusion of the book, the article will continue on for another two parts after today. Not too much to cover in terms of the final chapters for the purposes of model building, but there are some important betting lessons I will cover for Part IX here.

VINDICATION… the model had been fighting the Twins all season. When you have a model there will always be those one or two teams that will fight rational thought. The teams, despite the lack of statistical quality, will somehow come out a lot better (or worse) than they should. Sometimes this can be due to an assessment error, but assuming your model is good, usually it’s old fashioned luck. Teams and players get lucky. There is no way to fully account for luck because, by definition, luck is an unpredictable value. However, luck always and eventually wears out. Such a situation occurred for Joe Peta with the Minnesota Twins. The Twins killed the model’s profit up to August. Then in August, the Twins collapse finally came… the regression to the mean. Over the next two months and by the end of the season, the Twins action actually ended up being profitable for the model! Which means, all that luck came crashing down in the final two months. The model was right all along but luck kept the Twins propped up. In the end, the Twins paid the piper and the persistence of the model won the day.

I will compare this back to the hottest product of 2019, LOL, the Sharp Consensus. Holding a record of 42-19, the Sharp Consensus angle was an excellent performer! Do the math, it’s a 68.9% winning angle. However, many gamblers tried to make it even better! If there was a Sharp Consensus play on a bad team, they would say, I am going to pass on this one. If the play lost, now they had confirmation bias (remember that from earlier?) that their handicapping of the Top Sharp Consensus plays was a good idea. If the next time they handicapped the Sharp Consensus angle they missed a winner, well, that time was forgotten. Bottom line, the Top Sharp Consensus angle was a 68.9% winner, it’s pretty good by itself. Just roll with the wins and the losses. Take the thinking out of gambling especially when you have a winning angle! When you remove the thinking you remove a lot of the emotion of betting and that is CRUCIAL to long term success!!

The return on Peta’s model was 32.82 units and since he used a 100 unit bankroll, this is a 32.82% ROI. If this was a hedge fund, he would be the talk of Wall Street for that year. Since it is sports betting, the average bettor will say, “only 32 units in a season!” While at the same time this same bettor has never had a winning season. The goal in sports betting for any season is usually, based on a 100 unit bankroll, an 8-40% ROI. If you are a +EV bettor, you expect at least an 8% ROI. If you have had a monster season, you could see around a 40% ROI. Anything above 40% is usually not consistently repeatable and anything under 8% might have you checking whether it was just luck. The 8-40% range is usually the neighborhood to confirm that you indeed have an edge on the odds. So, any bettor starting with 100 units and hoping to have 300 units by the end of the season is slightly (I am being kind) delusional. Is it possible? Sure, you could play some wild parlays, get hot and cash in big. Unless you are betting 10 and 20 unit plays, which is insane, it’s going to be tough to pull off that +300 unit return with 1-2 unit bets. Even if you magically did, such performance is not something repeatable on a consistent/annual basis. So, instead of expecting to set records, grind those profits. Turn sports betting into an income generator, not an income eraser!

Side note to all this coverage on profits, bankroll and betting, when it came to the playoffs, Peta used the full season stats for his model assessment. Peta also realized that the top 1,2,3 pitchers in the rotation would be seen more than the #4 and 5 pitchers in the rotation. So, when he calculated his series wagers, he assumed runs allowed based on a team using only pitchers #1-#3. It’s a good way to look at it since they will get the bulk of the action. Thereby weighting your algorithm to the performance of #1-#3 is ideal to optimize the analysis. I will let Peta go into the specifics in his book but I wanted to point this out for your reference.

I am going to post Part X of this article on Friday, April 24th. I will then post Part XI, what I expect to be the final part of the article, the following Friday on May 1st. Waiting a full week between Parts X and XI will allow people to catch-up in their reading, complete all their homework and then send any questions. Which leads me to the homework. In preparation for the conclusion of the article next Friday, I would ask you to please begin to think of any questions about the article or model building. I will have you do the same thing following Part X which will be posted Friday. I will then select various questions and post them here to further the discussion as Part XI. Usually when one person has a question, others have similar questions and it will allow me to go through those items with all of you still reading this article. I believe Part XI will then be the conclusion of the article. I hope you have enjoyed the ride thus far! Start putting together those questions!

Part X – Model Development Beyond Baseball

I’ll start by tempering your expectations for this part of the article. To truly cover the topic of models for other sports would require another article, probably a book. So, my goal is, through using the above model you created as a base, to touch on how it could be applied to other sports. I will leave the heavier lifting to those of you looking to create the model. However, if you have come this far in the article, you have the work ethic and motivation to put the time in to crack the code. It should not be too hard for the wheels to begin moving for you and for you to transition to basketball, football, hockey, etc. Also, don’t forget, “The Google” is an amazing resource. I get asked questions all the time. One of them was “What is the Bill James exponent for football?” Well, if you copy that exact sentence or even similar wording right into “The Google” and press enter, the first entry is Wikipedia which explains how it is 2.37. I am always happy to answer questions and enjoy the discussions, don’t get me wrong! So, don’t hesitate to contact me. However, sometimes it wouldn’t hurt just to run your question in different ways through Google before checking in. There have been many times that people will send me a question, I Google it and give them the answer I find. LOL! Sometimes you can cut out the middle man and get a much faster response. Again, there is literally a world of information out there and sports analytics have many very good resources.

When I began writing this article, the question I would get the most is, how do you set up the model for point spread betting? I will touch on a few basics to hopefully set you on your way. First of all, the Bill James Pythagorean Theorem can be used for all major sports. The only difference is the exponent. In baseball, we use an exponent of 1.83. Research has shown that the correct exponent for basketball is 13.91, for football the exponent is 2.37 and for hockey the exponent is 2.05. So, we have the setup for the Bill James Pythagorean Theorem, now the question becomes how do we predict the points/goals for and points/goals against for our model to apply to these other sports?
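As a quick reference, here is a minimal sketch of the Pythagorean formula with the exponents listed above. The function and dictionary names are my own, and the scoring numbers in the example are hypothetical.

```python
# Exponents as quoted in this article.
EXPONENTS = {
    "baseball": 1.83,
    "basketball": 13.91,
    "football": 2.37,
    "hockey": 2.05,
}


def pythagorean_win_pct(scored: float, allowed: float, sport: str) -> float:
    """Expected win percentage from points/goals/runs scored and allowed."""
    k = EXPONENTS[sport]
    return scored ** k / (scored ** k + allowed ** k)


# Hypothetical per-game averages, only to show how the exponent changes the answer.
print(round(pythagorean_win_pct(5.0, 4.0, "baseball"), 3))
print(round(pythagorean_win_pct(110.0, 105.0, "basketball"), 3))
```

The bigger the exponent, the more a small scoring edge is worth. That is why a five point edge in basketball can mean more than a one run edge in baseball.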

In baseball, we learned that for every two hits a team gets, they will typically score one run. Does a similar relationship exist in other sports? Why yes it does! In hockey, 9% of all shots result in goals (on average). So, we could expect a team that had 30 shots to score 2.7 goals. Obviously you can’t score .7 goals but you can use 2.7 in the Bill James formula to calculate expected win percentage. Therefore, one angle could be to use player statistics to calculate the expected shots per game for a team. You could then tweak the shots on the high or low end to see how it changes a team’s performance. By fine tuning to shots and then extrapolating expected goals, you can limit the statistical variance in your calculations. You can even create one calculation to see the team’s low end of expected shots, another calculation to assess the high end. You could then see what a team’s win probability would be if the team achieves the low end of shots and if the team achieves the high end of shots. If both probabilities show value to the moneyline the book is hanging, you have a play! Obviously, we also have to work goalies into this equation but my point is you can analyze hockey similar to baseball… run it through the same probability and head to head formulas you use for baseball and get a money line calculation.
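Here is a rough sketch of the low end/high end idea for hockey. It leans on a big simplification that is mine, not Peta's: it plugs the opponent's expected goals straight into the Pythagorean formula as goals against and ignores goalies entirely. The shot range, the opponent's goals and the book's implied probability are all hypothetical.

```python
SHOOTING_PCT = 0.09       # share of shots that become goals, per the article
HOCKEY_EXPONENT = 2.05    # Pythagorean exponent for hockey, per the article


def expected_goals(shots: float) -> float:
    """Convert expected shots into expected goals."""
    return shots * SHOOTING_PCT


def win_probability(goals_for: float, goals_against: float) -> float:
    """Pythagorean win expectation from expected goals for and against."""
    gf = goals_for ** HOCKEY_EXPONENT
    ga = goals_against ** HOCKEY_EXPONENT
    return gf / (gf + ga)


def shows_value(win_prob: float, book_implied_prob: float) -> bool:
    """True when your number beats the probability baked into the moneyline."""
    return win_prob > book_implied_prob


# Hypothetical inputs for one game.
low_shots, high_shots = 27, 33     # low and high end of the team's expected shots
opponent_goals = 2.5               # opponent's expected goals
book_implied_prob = 0.45           # probability implied by the book's moneyline

low_p = win_probability(expected_goals(low_shots), opponent_goals)
high_p = win_probability(expected_goals(high_shots), opponent_goals)

# It is only a play if BOTH ends of the range show value.
is_play = shows_value(low_p, book_implied_prob) and shows_value(high_p, book_implied_prob)
print(f"low end {low_p:.3f}, high end {high_p:.3f}, play: {is_play}")
```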

In football, if you take the yardage a team achieves on offense and divide by 15, you can get an approximation of their points. Obviously, the same applies to how many yards the defense allows. So, if you can put together a calculation to predict yards allowed and yards gained for a team, you can extrapolate points, which means you can further extrapolate win percentage (thanks again Bill James!). Now you have all the numbers you need to use the same head to head and probability calculations to compute moneylines in football. Also, don’t forget the example in Part IV of this article, where I assess how Tom Brady would change the Tampa Bay Bucs’ performance. It is just another means of doing an NFL analysis to get points scored and allowed.
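The yards-to-points shortcut is easy to sketch. The function names below are my own and the yardage figures are hypothetical. The 15 yards per point and the 2.37 exponent are the numbers used in this article.

```python
YARDS_PER_POINT = 15.0      # rough league-wide conversion used in this article
FOOTBALL_EXPONENT = 2.37    # Pythagorean exponent for football


def points_from_yards(yards: float) -> float:
    """Approximate points from total yards."""
    return yards / YARDS_PER_POINT


def expected_win_pct(yards_gained: float, yards_allowed: float) -> float:
    """Yards -> points -> Pythagorean win percentage."""
    points_for = points_from_yards(yards_gained) ** FOOTBALL_EXPONENT
    points_against = points_from_yards(yards_allowed) ** FOOTBALL_EXPONENT
    return points_for / (points_for + points_against)


# Hypothetical team: predicted to gain 360 yards and allow 300 per game.
print(points_from_yards(360), points_from_yards(300))
print(round(expected_win_pct(360, 300), 3))
```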

For basketball, shots and field goal percentages are the keys obviously. It doesn’t break down much further. Basketball is easier because it’s common to see shot attempts and the resulting FG% as part of standard stats posted for teams. The catch with basketball is working out a calculation to account for the variance that can occur with a team’s FG%. Unlike yards or hits, FG% can vary wildly from game to game. What determines it? Do teams tend to regress to the mean after a game they shot lights out? I tend to find that knowing the team’s FG% average range is the key and then you work from there. If a team is shooting on the low end of their average FG% and is playing a team that is predicted to shoot on the high end of their average FG%, BUT still shows value for the low % team based on the odds, you have a bet! You have a bet because in a worst case expected scenario, the under-performer is an over-performer based on the current line. The model builder will want to pull together stats, play around and work to unlock that code. I cannot give you the recipe because then everyone will have Grandma’s secret chocolate chip cookies. You wouldn’t want the recipe either, it would no longer be a secret. Play around and create your own recipe that nobody else uses and which gives you the edge on the odds! It’s a fun journey and trust me, when you get to the end and realize what you have discovered, it is like drinking from the Holy Grail after a long journey!

There are statistical relationships in all sports that are similar to those in baseball. The key is taking the time to investigate them. When you have your model setup to predict points/goals/runs for and against for other sports, then you just plug it into the Bill James formula and analyze it like you would an individual baseball game. No difference. It will give you the probability of one team beating another, which will then give you the moneyline for that game. You then wager based on the value the sportsbook’s lines provide. So, this part of the article covers adjusting to moneyline analysis for other sports… what about the spreads and totals?

The easiest way to handle a spread model is to create team power ratings. Yes, these are the Sagarin or KenPom type ratings everyone often uses. Most sportsbooks create their own power ratings to start a season and then tweak those ratings as the season goes along. How do you create your own version of power ratings? Again, I will do my best to consolidate a long topic into a short one. There is a shortcut you can use to get you going in the right direction. I’ll use NFL. In the NFL it is usually considered among the bookmaking powers that be, that the difference between the top team in the NFL and the bottom team in the NFL is 20-21 points. Now that we know what the spread should be between top and bottom, we have to fill in the other 30 teams in between. We also have to figure out what single metric or combined calculation we will use to assess and thereby rank the quality of a team. Obviously, now you have to figure out a metric(s) that you feel correlates to how good a team performs. Once you calculate that metric for each team, you rank those teams and then set the top team as 20 points of separation away from your bottom team. Now you have ranked your teams. What does this look like in practice?

Let’s say you feel “Yards Per Point” is a good metric to decide how a team performs. As discussed above and which you can see in the table below, the average for the NFL is 15 yards per point. In reality, your calculation would use a number of different variables to add up to a single number which would then be used to rank each team. For the purposes here, I am just showing how you can use a combined or single metric, lay it out in a table, set the #1 team to a power rating of 1 and the worst team to a power rating of 21. You can really use any number scale so long as you then shake out your ratings in between with top to bottom separation being a 20-21 point spread. So, let’s just use 2019 YPP for lack of a more complicated alternative. Hopefully, the below example will illustrate what I am describing in creating power ratings. So, if you did this assessment using Yards Per Point as your metric, your power ratings might look like this (using TeamRankings.com YPP for 2019)…

Obviously the power ratings example above can apply to NCAA football, basketball and the NBA. You just need to figure out how you will rank the teams and what metric to use. Then it is a matter of figuring the proper top to bottom spread and filling in the gaps in between. Yes, like any of this, it is a process, takes time, will have you experience multiple failures along the way and be a lot of work. However, imagine carrying a list of numbers on your phone (in the case of NFL Power ratings) whereby you made your weekly adjustments and now you can just do some basic subtraction, create your point spreads and walk into the book to place your bets. It’s pretty cool. You have a secret weapon in your pocket and the book will never know what hit them!
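The rank-and-scale step can be sketched in a few lines. Two things here are my assumptions: the teams in between are spaced in proportion to their metric (you could just as easily space them by rank), and the scale runs from 0 for the worst team to 20 for the best. The team names and metric values are placeholders, not real 2019 numbers, and home field is ignored.

```python
def power_ratings(metric_by_team, spread=20.0, lower_is_better=True):
    """Scale a single metric so the best and worst teams sit `spread` points apart.

    The worst team gets 0 and the best team gets `spread`. Teams in between
    are placed in proportion to their metric (an assumption of this sketch).
    """
    values = list(metric_by_team.values())
    best = min(values) if lower_is_better else max(values)
    worst = max(values) if lower_is_better else min(values)
    if best == worst:
        return {team: 0.0 for team in metric_by_team}   # no separation to scale
    return {
        team: spread * (worst - value) / (worst - best)
        for team, value in metric_by_team.items()
    }


def predicted_spread(ratings, team_a, team_b):
    """Points team_a is expected to beat team_b by on a neutral field."""
    return ratings[team_a] - ratings[team_b]


# Placeholder metric values. For Yards Per Point, a lower number is better.
yards_per_point = {"Team A": 13.0, "Team B": 15.0, "Team C": 17.0}

ratings = power_ratings(yards_per_point)
print(ratings)
print(predicted_spread(ratings, "Team A", "Team B"))
```

Once the ratings exist, making a line really is just subtraction, which is why a list of numbers on your phone is all you need at the window.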

So, that was a crash course on other sports modeling to create moneylines and to create power ratings for spread analysis. The question now becomes, how do you assess totals? Well, let’s use baseball for starters. Your baseball model has an individual game roster where you calculate expected runs scored and runs allowed based on a specific lineup. Yeah, you see where I am going and yes, it can be that easy. Let’s say you calculate that the Twins will score 5 runs and allow 5 runs against the Tigers who will score 3 runs and allow 4 runs. Well, that means on average the Twins will score 5 runs but Detroit is expected to only allow 4 runs. OK, the average between those two is 4.5. We then see the Twins are expected to allow 5 runs but Detroit will only score 3 runs. It gives us an average of 4 runs. Well, we now have an expectation for Detroit scoring 4 runs and Minnesota scoring 4.5 runs. We could reasonably extrapolate that we expect this total to be 8.5 using our metrics. If the total is 9, you have a value to the UNDER. Even just a half run in baseball totals, due to their size, is a good advantage. So, I have done nothing new to the MLB model we created earlier. I just used the same data put out by the individual game analysis used to calculate moneylines and calculated a total at the same time. Just remember that totals are affected also by weather/wind and umpires, so those variables need to make their way into your calculation. However, for analyzing totals in MLB, the bulk of the work is already done by your model.
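The Twins and Tigers arithmetic above, written out as a small sketch. The function name is hypothetical, and weather, wind and umpires are deliberately left out.

```python
def projected_total(a_scored, a_allowed, b_scored, b_allowed):
    """Blend each offense with the opposing defense to project runs and the total.

    Returns (team A runs, team B runs, game total).
    """
    a_runs = (a_scored + b_allowed) / 2    # what A scores vs what B allows
    b_runs = (b_scored + a_allowed) / 2    # what B scores vs what A allows
    return a_runs, b_runs, a_runs + b_runs


# Twins: score 5, allow 5. Tigers: score 3, allow 4.
twins_runs, tigers_runs, total = projected_total(5, 5, 3, 4)
print(twins_runs, tigers_runs, total)

posted_total = 9.0
if total < posted_total:
    print("Value to the UNDER")
elif total > posted_total:
    print("Value to the OVER")
else:
    print("No value")
```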

What about totals in other sports? Again, the models you created to calculate points/goals scored & allowed for the other sports, so you could calculate individual game moneylines, are all you need for a start. You then need to assess the other factors (if any) that can affect totals beyond just players (like officials and weather). However, most of your work for totals is done, because the basis for your model is a calculation whose goal is to figure out win probability based on points scored and allowed! The transition to calculating totals is therefore quite easy!

The final part of this article, Part XI, will be posted next Friday, May 1st, instead of on Tuesday as has been the usual schedule. I want to allow a decent amount of time for people to catch up on their reading, spend time creating their model and work on everything in between. Then by Tuesday or Wednesday you can submit any questions. So, your homework for Part XI is to think of any questions you have from this article and email them to me (you can also use the form on the Contact page here on the website). I will use Part XI to cover some common questions and to wrap up everything within this article. Thank you for sticking with it for so long and I hope the information will be useful to you! Have an excellent weekend and I will see you back here next Friday night!

Part XI – Wrapping It All Up

Well, we have come to the end of our journey! I hope you have enjoyed the article. It has definitely helped pass some quarantine time for me and I hope the same is true for you. The topic of building an algorithm is something I have been asked to cover for over a year now. Usually, there was so much going on that finding the time just didn’t work out. Well, thanks to quarantine, I have had plenty of time the past month. I hope this article will help you in your algorithm adventures moving forward! Yes, I will be keeping the article permanently on the Articles page here at the website, so you can refer to it anytime.

In Part X, I asked for any questions you might have had on the process or the topic of algorithms. I had already received many questions along the way and answered a lot of them in different parts of the article as it developed, so I wasn’t sure if I would get any more. However, I was pleasantly surprised that there were a few, so let me jump into those…

Do we need to use power ratings or can we just use a moneyline conversion when dealing with point spreads?

Should we be concerned about sports with a smaller sample size? Is it harder to assess football where teams only have 16 games in a season versus baseball which has 162 games?

Obviously, in any statistical assessment, the more data you have, the better you’re able to assess your model. However, I would not let the smaller sample size dissuade you from creating an NFL model, BUT I would be sure to temper my expectations. In baseball, when you have a model that might have had 300-500+ plays over the course of a season, you have a nice sample size to say whether your model is good or not. In football, with maybe 50 plays, you have to be prepared for the possibility that your results were just luck (good or bad). So, don’t give up after ten NFL plays that go 0-10. Look at my 1st Quarter algorithm this past year in the NFL. The algorithm sucked at the beginning of the season, but by the time the season was done, everyone was asking for the 1st Quarter algorithm each week. It became automatic! So, don’t give up, but also don’t celebrate based on the results of a small sample size. Again, creating a model takes patience, time and effort… three things many gamblers do not want to involve in their gambling (or lives). Most bettors just want to quickly come up with a play, bet money, win money and live the leisurely life. I have yet to find a gambler who was profitable beyond short-term luck without putting in a ton of time and paying their dues along the way. Trust me though, if you work hard at this, you will be rewarded with the coolest toy ever! You will have a model to predict events that you bet on. Doesn’t get any better!
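To put a rough number on the sample size point, here is a small sketch using the standard error of a win rate. The 55% "true" win rate is purely a hypothetical figure for illustration, and the calculation assumes every play is independent with the same chance of winning, which real betting results will only approximate.

```python
import math


def win_rate_standard_error(true_win_rate, plays):
    """Standard error of an observed win rate over a number of independent plays."""
    return math.sqrt(true_win_rate * (1 - true_win_rate) / plays)


HYPOTHETICAL_WIN_RATE = 0.55  # assumed for illustration, not a real result

for plays in (50, 500):
    se = win_rate_standard_error(HYPOTHETICAL_WIN_RATE, plays)
    low = HYPOTHETICAL_WIN_RATE - 2 * se
    high = HYPOTHETICAL_WIN_RATE + 2 * se
    print(f"{plays} plays: observed win rate could plausibly run {low:.1%} to {high:.1%}")
```

Under those assumptions, a model that truly wins 55% of the time could show anything from roughly 41% to 69% over 50 plays, but only roughly 51% to 59% over 500 plays. That is why an NFL season tells you far less about your model than a baseball season does.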

How do we create a model for live or in-play betting? Should we allow our pre-game model assessment to play a part in how we bet in-play?

How should we split our bankroll between futures and daily betting?

I would usually say you’d want to have, at most, 10-15% of your bankroll dedicated to futures bets. Since futures bets are based 100% on predicted data, with no in-season data (actual performances) to draw on, you don’t want to be overleveraged. You want the vast majority of your bankroll to be free so you can adjust your predictions as the season goes and take advantage of those actual results. You and your model will become more intelligent as you see how the players and teams actually perform that season. Therefore, you want the bulk of your bankroll free to be able to attack as you adjust.

Let’s say we have built our model to do NFL season projections and come up with win totals. How should we think about calculating odds to assess who will win a division?

Basically, when you break down the expected win percentages of each team going into the season, you would just separate the teams by division and order them by win percentage. You then have an expected order of performance in each division. Now, you have four teams in a division. If all teams were even, they would each have a 25% chance of winning the division. True odds would be +300 in such a scenario. Since all teams are not even, what do we do? There are some incredibly advanced mathematical ways to calculate the odds of winning a division. However, I fear I might lose too many people and it won’t be helpful. It requires analyzing the performance with emphasis on division games. So, let’s simplify it a little. We have already calculated our expected season win percentage for each team. We have then separated out the teams by division and ranked them in order of our calculated win percentage. I would then say to calculate how many wins that means each team will have. Your spreadsheet should look something like this…
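As a rough sketch of that spreadsheet step in Python: the snippet below converts a win probability into fair American odds (confirming the +300 figure for an even four-team division) and turns season win percentages into expected wins over a 16-game schedule. Miami's 65% comes from the example that follows; the other three percentages are back-calculated from the win totals quoted there, and the function names are just placeholders.

```python
GAMES_IN_SEASON = 16


def fair_american_odds(probability):
    """Convert a win probability into no-vig American odds."""
    if probability >= 0.5:
        return -100 * probability / (1 - probability)
    return 100 * (1 - probability) / probability


# An even four-team division: each team has a 25% chance.
even_division_price = fair_american_odds(0.25)
print(f"Even division: {even_division_price:+.0f}")

# Calculated season win percentages for the AFC East example.
win_pct = {"Miami": 0.65, "Buffalo": 0.55, "NY Jets": 0.35, "New England": 0.15}

# Rank the division by win percentage and convert to expected wins.
expected_wins = {
    team: pct * GAMES_IN_SEASON
    for team, pct in sorted(win_pct.items(), key=lambda item: item[1], reverse=True)
}

for team, wins in expected_wins.items():
    print(f"{team:<12} {win_pct[team]:>4.0%}  {wins:>5.1f} wins")
```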

In this example, our model has Miami at the top of the AFC East with a calculated win percentage of 65%. That means Miami is theoretically expected to win 10.4 games, followed by Buffalo with 8.8 wins, the NY Jets with 5.6 wins and New England with 2.4 wins. Clearly, our model shows Miami head and shoulders above the rest of the teams, with a 1.6 win advantage over Buffalo. In a normal scenario, the average price for a division favorite is +120. To calculate odds for the other teams, you move that +120 price up roughly \$0.50 for every 0.75 wins a team is calculated to be behind the top team. So, in the example above, Miami would be roughly +120, Buffalo +220 (1.6 wins behind = roughly +\$1.00), NY Jets +440 (4.8 wins behind) and New England +653 (8.0 wins behind). Again, this is not a perfect calculation, but it is a much faster shortcut to put you in the ballpark for a valid assessment. If you calculated the above performance for the AFC East and your book had Miami +180, obviously that’s outside a 50-cent variance (the minimum I would consider when using this method) and you have a value!
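The shortcut above can be written out as a small Python sketch. The function and parameter names are placeholders, and the 50-cent value check reflects one reading of the "minimum variance" mentioned above. Note that the straight formula puts Buffalo at about +227, which the example rounds down to roughly +220.

```python
def shortcut_division_odds(expected_wins, favorite_price=120, step_price=50, step_wins=0.75):
    """Start the favorite at +120 and add 50 cents per 0.75 wins behind the favorite."""
    top_wins = max(expected_wins.values())
    return {
        team: favorite_price + step_price * (top_wins - wins) / step_wins
        for team, wins in expected_wins.items()
    }


# Expected wins from the AFC East example.
afc_east = {"Miami": 10.4, "Buffalo": 8.8, "NY Jets": 5.6, "New England": 2.4}
model_odds = shortcut_division_odds(afc_east)

for team, price in model_odds.items():
    print(f"{team:<12} +{price:.0f}")

# Value check: the book's price must beat the model price by at least 50 cents.
MIN_EDGE = 50
book_price_miami = 180  # the book's price in the example
edge = book_price_miami - model_odds["Miami"]
print(f"Miami edge: {edge:.0f} cents, value: {edge >= MIN_EDGE}")
```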

How do we go about creating a golf model?

Well my friends, that’s it! The article “Designing An Algorithm (A Step by Step Guide)” has come to a conclusion. I wish all of you the best of luck in your model development and in your action. I am sure in the months ahead I will have other articles that will connect back to this one. Obviously, algorithms and model building are key to what I do. So, the topic will definitely continue into the future in various ways. Thanks for following along with me here and if you have questions, you know I am happy to help! Just send an email or use the form on the Contact page here at the website. Stay safe & healthy! Good luck in your model creation journey and in your action!

THE END