150 Machine Learning, Statistics, and Maths Articles


Here you will find the articles and tutorials I published between 2017 and 2021, covering original, off-the-beaten-path content in machine learning, operations research, statistics, dynamical systems, mathematics, and related topics. The emphasis is on applications, the style is compact, and many illustrations are provided. Concepts are explained in simple English, avoiding jargon and arcane theories.

Orbit of one instance of the sine map

To receive updates about new articles and eBooks, sign up for our newsletter. Two of my eBooks are available for free. Besides my Data Science Central articles listed below, I also invite you to read my posts on MathOverflow, StackExchange, and CrossValidated.

Here is the list, broken down by category, and in reverse chronological order.

1. Core Articles


  1. Simple Machine Learning Approach to Testing for Independence
  2. An Easy Way to Solve Complex Optimization Problems in Machine Learning
  3. Introducing an All-purpose, Robust, Fast, Simple Non-linear Regression
  4. Variance, Attractors and Behavior of Chaotic Statistical Systems
  5. New Family of Generalized Gaussian Distributions
  6. Gentle Approach to Linear Algebra, with Machine Learning Applications
  7. Confidence Intervals Without Pain
  8. Re-sampling: Amazing Results and Applications
  9. How to Automatically Determine the Number of Clusters in your Data - and more
  10. New Perspectives on Statistical Distributions and Deep Learning
  11. A Plethora of Original, Not Well-Known Statistical Tests
  12. New Decimal Systems - Great Sandbox for Data Scientists and Mathematicians
  13. Are the Digits of Pi Truly Random?
  14. Data Science and Machine Learning Without Mathematics
  15. Advanced Machine Learning with Basic Excel
  16. State-of-the-Art Machine Learning Automation with HDT
  17. Tutorial: Neutralizing Outliers in Any Dimension
  18. The Fundamental Statistics Theorem Revisited
  19. Variance, Clustering, and Density Estimation Revisited
  20. The Death of the Statistical Tests of Hypotheses
  21. 4 Easy Steps to Structure Highly Unstructured Big Data, via Automated Indexation 
  22. The best kept secret about linear and logistic regression
  23. Black-box Confidence Intervals: Excel and Perl Implementation
  24. Jackknife and linear regression in Excel: implementation and comparison
  25. Jackknife logistic and linear regression for clustering and predictions


  1. New Stock Trading and Lottery Game Rooted in Deep Math
  2. Time series, Growth Modeling and Data Science Wizardry
  3. How to Stabilize Data Systems, to Avoid Decay in Model Performance
  4. 22 Differences Between Junior and Senior Data Scientists
  5. The First Things you Should Learn as a Data Scientist - Not what you Think
  6. Difference between Machine Learning, Data Science, AI, Deep Learning, and Statistics
  7. 21 data science systems used by Amazon to operate its business
  8. Life Cycle of Data Science Projects
  9. 40 Techniques Used by Data Scientists
  10. Designing better algorithms: 5 case studies
  11. Architecture of Data Science Projects
  12. 24 Uses of Statistical Modeling (Part II) | (Part I)
  13. The ABCD's of Business Optimization
  14. What you won't learn in stats classes
  15. Biased vs Unbiased: Debunking Statistical Myths

2. Blog Posts About Data Science


  1. Defining and Measuring Chaos in Data Sets: Why and How, in Simple Words
  2. Hurwitz-Riemann Zeta And Other Special Probability Distributions
  3. Maximum runs in Bernoulli trials: simulations and results
  4. Moving Averages: Natural Weights, Iterated Convolutions, and Central Limit Theorem
  5. Amazing Things You Did Not Know You Could Do in Excel
  6. New Tests of Randomness and Independence for Sequences of Observations
  7. Interesting Application of the Poisson-Binomial Distribution
  8. Alternative to the Arithmetic, Geometric, and Harmonic Means
  9. Bernoulli Lattice Models - Connection to Poisson Processes
  10. Simulating Distributions with One-Line Formulas, even in Excel
  11. Simplified Logistic Regression
  12. Simple Trick to Normalize Correlations, R-squared, and so on
  13. Simple Trick to Remove Serial Correlation in Regression Models
  14. A Beautiful Result in Probability Theory
  15. Long-range Correlations in Time Series: Modeling, Testing, Case Study
  16. Difference Between Correlation and Regression in Statistics
  17. One Trillion Random Digits
  18. New Perspective on the Central Limit Theorem and Statistical Testing
  19. Simple Solution to Feature Selection Problems
  20. Scale-Invariant Clustering and Regression
  21. Deep Dive into Polynomial Regression and Overfitting
  22. Stochastic Processes and New Tests of Randomness - Application to Cool Number Theory Problem
  23. A Simple Introduction to Complex Stochastic Processes - Part 2
  24. A Simple Introduction to Complex Stochastic Processes
  25. High Precision Computing: Benchmark, Examples, and Tutorial
  26. Logistic Map, Chaos, Randomness and Quantum Algorithms
  27. Graph Theory: Six Degrees of Separation Problem
  28. Interesting Problem for Serious Geeks: Self-correcting Random Walks
  29. 9 Off-the-beaten-path Statistical Science Topics with Interesting Applications
  30. Data Science Method to Discover Large Prime Numbers
  31. Nice Generalization of the K-NN Clustering Algorithm - Also Useful for Data Reduction
  32. How to Detect if Numbers are Random or Not
  33. How and Why: Decorrelate Time Series
  34. Distribution of Arrival Times of Extreme Events
  35. Why Zipf's law explains so many big data and physics phenomena


  1. Some Irresistible Integrals, Computed Using Statistical Concepts
  2. Curious Mathematical Problem
  3. Another Off-the-beaten-path Data Science Problem
  4. Two More Math Problems: Continued Fractions, Nested Square Roots, Digits of Pi
  5. Mathematical Olympiads for Undergrad Students
  6. Difficult Probability Problem: Distribution of Digits in Rogue Systems
  7. Little Stochastic Geometry Problem: Random Circles
  8. Question: Correlation Coefficient in Flat Line Model
  9. Question about Some Statistical Distributions
  10. Coefficient of Correlation for Non-Linear Relationships
  11. Paradox Regarding Random (Normal) Numbers
  12. Curious Mathematical Object: Hyperlogarithms
  13. 88 percent of all integers have a factor under 100
  14. Math Challenge: Computing the Average Rotational Speed of Earth

Business and General

  1. Common Errors in Machine Learning due to Poor Statistics Knowledge
  2. How to Lie with P-values
  3. Growth Modeling for Business Managers and Executives
  4. Unexpected Use of AI: Solving Complex Mathematical Problems 
  5. 8 Tips to Leverage Analytics: Advice for Small (and Big) Businesses
  6. Four Types of Data Scientist
  7. New Directions in Cryptography
  8. Black Hat Data Science
  9. From Petabytes to Nanobits, with Application to Blockchain
  10. Preventing Cambridge Analytica and Others to Hack into Facebook Data
  11. Interesting Application of the Zipf Distribution: Data Purging
  12. 22 tips for better data science
  13. Machine Learning Algorithm to Trade Bitcoin
  14. How Mathematical Discoveries are Made
  15. How to Solve the New $1 Million Kaggle Problem - Home Value Estimates
  16. Detecting Fake News, Fake Reviews, Fake Accounts, Fake Pictures
  17. 10 Data Science, Machine Learning and IoT Predictions for 2017
  18. Modern Computational Advertising on Social Networks: The Basics
  19. Building an Algorithm to Break Strong Encryption
  20. Why so many Machine Learning Implementations Fail?

3. Other Blog Posts


  1. More Surprising Math Images
  2. Beautiful Mathematical Images
  3. Deep visualizations to Help Solve Riemann's Conjecture
  4. Spectacular Visualization: The Eye of the Riemann Zeta Function
  5. New Probabilistic Approach to Factoring Big Numbers
  6. Simple Trick to Dramatically Improve Speed of Convergence
  7. State-of-the-Art Statistical Science to Tackle Famous Number Theory Conjectures
  8. New Perspective on Fermat's Last Theorem
  9. Fun Math: Infinite Nested Radicals of Random Variables - Connection with Fractals and Brownian Motions
  10. Surprising Uses of Synthetic Random Data Sets
  11. Two New Deep Conjectures in Probabilistic Number Theory
  12. Extreme Events Modeling Using Continued Fractions
  13. A Strange Family of Statistical Distributions
  14. Some Fun with Gentle Chaos, the Golden Ratio, and Stochastic Number Theory
  15. Fascinating New Results in the Theory of Randomness
  16. From Infinite Matrices to New Integration Formula
  17. New Mathematical Conjecture?
  18. Cool Problems in Probabilistic Number Theory and Set Theory
  19. Fractional Exponentials - Dataset to Benchmark Statistical Tests
  20. Two Beautiful Mathematical Results - Part 2
  21. Two Beautiful Mathematical Results
  22. Four Interesting Math Problems
  23. Number Theory: Nice Generalization of the Waring Conjecture
  24. Fascinating Chaotic Sequences with Cool Applications
  25. Representation of Numbers with Incredibly Fast Converging Fractions
  26. Yet Another Interesting Math Problem - The Collatz Conjecture
  27. Simple Proof of the Prime Number Theorem
  28. Factoring Massive Numbers: Machine Learning Approach
  29. Representation of Numbers as Infinite Products
  30. A Beautiful Probability Theorem
  31. Fascinating Facts and Conjectures about Primes and Other Special Nu...
  32. Three Original Math and Proba Challenges, with Tutorial
  33. Challenges of the week


  1. Why You Should be a Data Science Generalist - and How to Become One
  2. Is a PhD helpful for a data science career?
  3. Full Stack Data Scientist: The Elusive Unicorn and Data Hacker
  4. Are data science or stats curricula in US too specialized?
  5. How do you identify an actual data scientist?
  6. Is it still possible today to become a self-taught data scientist?
  7. Why Logistic Regression should be the last thing you learn when becoming a Data Scientist
  8. 5 Myths About PhD Data Scientists
  9. Can you be sued for using the wrong data?


  1. Six Degrees of Separation Between Any Two Data Sets
  2. 7 Simple Tricks to Handle Complex Machine Learning Issues
  3. From Machine Learning to Machine Unlearning
  4. First Doctorship in Data Science
  5. Python Overtakes R for Data Science and Machine Learning
  6. Mars Craters: An Interesting Stochastic Geometry Problem
  7. Sample Projects for Data Scientists in Training
  8. Number Representation Systems Explained in One Picture
  9. Data Science Cheat Sheet
  10. Hitchhiker's Guide to Data Science, Machine Learning, R, Python
  11. Answers to dozens of data science job interview questions
  12. Advanced Machine Learning with Basic Excel
  13. Difference between ML, Data Science, AI, Deep Learning, and Statistics

Data Shaping Solutions LLC, 4511 Cutter Drive, Anacortes, WA 98221 | Contact: vincentg@datashaping.com