Documenting research progress through the semester.
Reading papers
Implementing GPT Baseline
Evaluated GPT-4
and GPT-3.5
on the Trial Outcome Prediction
datasets with various prompting strategies, including Chain of Thought, and breaking down the challenge into subtasks. When tested for binary classification with 30-50 uniformly distributed samples, the models showed very poor performance.