Machine Learning Assessment & Roadmap

Machine learning can add power to your analysis of pathology or remote sensing images:

  • Assist experts by improving efficiency, precision, and repeatability
  • Learn concepts beyond human capabilities, such as molecular biomarkers, patient outcomes, or treatment response, directly from images

Using my proprietary process, I'll author a detailed analysis of your ML algorithms for quantifying images. This report will outline how to advance your project using state-of-the-art techniques.


The report will detail each of the components below and make recommendations such as the following:

1. Images and annotations

  • Data preprocessing or cleaning steps to improve model performance
  • Ways to better handle challenges due to data set size, image size and appearance, or availability and accuracy of annotations
  • Additional data that could be used to improve the model (e.g., public data sets, additional modalities of data)

2. Related work

  • References to existing literature that can guide expectations
  • Links to relevant toolkits

3. Metrics

  • Alternative metrics to better capture project objectives and measure progress

4. Modeling

  • Suggestions for better modeling of unique aspects of the problem or incorporating additional data
  • Transfer learning strategies to improve model initialization
  • Optimization strategies to improve model training
  • Different model types that may improve results

5. Analysis

  • Procedure for validating the model on a held-out test set
  • Steps for performing an error analysis to guide directions for data cleaning or model improvement
  • Model generalizability considerations

The Process:

1. Discovery call

Let's hop on a call to learn more about each other. We'll discuss where you are now, where you need to go, what's working well, and what you're struggling with.

2. Data and info gathering

I'll talk with your technical team about your current data pipeline and algorithms. I'll likely ask for any existing documentation, data samples, and possibly some code.

3. Report generation

Guided by the five components above, I'll write a report outlining the strengths of your current data and modeling approach, along with recommendations for further improvement.

4. Review

Upon delivering the report, I'll schedule a Zoom call to review my findings with your team and answer any questions.


Results:

  • An understanding of the key components for success with machine learning
  • A set of improvements that can be implemented and tested by in-house data and ML engineers
  • Confidence in the success of the project

Anticipated timeline:

3-4 weeks to deliver the report, followed by a Zoom call to review it