Generative AI: 1. Ethics 2. CLIP

From FDHwiki

Revision as of 21:40, 4 December 2023

Project Plan and Milestones

Weekly Plan

{| class="wikitable"
! Date !! Task !! Completion
|-
!scope="row"|Week 4
|
Paper reading. <br>
Exploring existing RLHF and RLAIF approaches. <br>
Exploring the red-teaming dataset.
|
|-
!scope="row"|Week 5
|
Familiarizing with the Dromedary, SALMON, and Llama base models.
|
|-
!scope="row"|Week 6
|
Evaluation of different base models. <br>
Choice of Llama 2 as our baseline model.
|
|-
!scope="row"|Week 7
|
Red-teaming dataset exploration. <br>
Reading about ethical theories.
|
|-
!scope="row"|Week 8
|
Exploring the [https://github.com/hendrycks/ethics ETHICS dataset].
|
|-
!scope="row"|Week 9
|
Formatting the ETHICS dataset for Llama fine-tuning and evaluation. <br>
Supervised fine-tuning of the Llama model.
|
|-
!scope="row"|Week 10
|
Evaluation of the Llama model before and after fine-tuning on the ETHICS dataset. <br>
Mid-term presentation & start of the Wikipedia page following the plan.
|
|-
!scope="row"|Week 11
|
Reading about reinforcement learning with PPO. <br>
Re-formatting the deontology dataset. <br>
Creation of the preference model.
|
|-
!scope="row"|Week 12
|
|
|-
!scope="row"|Week 13
|
|
|-
!scope="row"|Week 14
|
Writing the Wikipedia page & final presentation.
|
|}
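As a rough illustration of the Week 9 formatting step, the sketch below turns ETHICS-style commonsense examples (a scenario plus a 0/1 acceptability label) into instruction-style prompt/response pairs for supervised fine-tuning. The prompt template, field names, and label convention are our illustrative assumptions, not the dataset's official schema.

```python
# Sketch: convert ETHICS-style (scenario, label) pairs into prompt/response
# text for supervised fine-tuning. The prompt wording and the convention
# "label 1 = morally unacceptable" are assumptions for illustration.

def format_example(scenario: str, label: int) -> dict:
    """Build one instruction-style training pair from a labeled scenario."""
    prompt = (
        "Consider the following scenario and answer whether the "
        "behaviour is morally acceptable (yes/no).\n\n"
        f"Scenario: {scenario}\nAnswer:"
    )
    response = " no" if label == 1 else " yes"
    return {"prompt": prompt, "response": response}

# Made-up examples in the assumed (scenario, label) shape:
examples = [
    ("I told my friend the truth even though it hurt.", 0),
    ("I read my sister's diary without asking.", 1),
]
pairs = [format_example(s, l) for s, l in examples]
```

Pairs in this shape can then be fed to any standard supervised fine-tuning pipeline that expects prompt/response text fields.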

Milestone 1

  • Choose the project subject.
  • Read papers on existing studies in this field.
  • Define our research questions.

Milestone 2

  • Refine our research questions.
  • Explore different ethical theories.
  • Find an appropriate dataset.
  • Evaluate our fine-tuned supervised model.
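The evaluation step in Milestone 2 amounts to comparing the model's yes/no answers against the ETHICS labels before and after fine-tuning. A minimal accuracy helper, where the answer strings and the 0/1 label mapping are illustrative assumptions:

```python
def accuracy(predictions: list[str], labels: list[int]) -> float:
    """Fraction of yes/no predictions matching 0/1 ETHICS-style labels.

    Assumed mapping: label 0 (acceptable) -> "yes",
    label 1 (unacceptable) -> "no".
    """
    expected = ["yes" if label == 0 else "no" for label in labels]
    correct = sum(p.strip().lower() == e for p, e in zip(predictions, expected))
    return correct / len(labels)

# Comparing a baseline run against a fine-tuned run on made-up outputs:
labels = [0, 1, 1, 0]
baseline = ["yes", "yes", "no", "no"]    # 2 of 4 match the labels
fine_tuned = ["yes", "no", "no", "yes"]  # all 4 match the labels
```

Running the same helper on pre- and post-fine-tuning outputs gives the before/after comparison planned for Week 10.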

Milestone 3

  • Obtain our preference and reinforcement-learning models.
  • Analyze the results.
  • Write the Wikipedia page.
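The preference model in Milestone 3 can be summarised by the standard Bradley–Terry pairwise objective used in RLHF reward modelling: given reward scores for a preferred and a rejected answer, the loss is -log σ(r_chosen - r_rejected). A minimal sketch in plain Python, with made-up scores rather than outputs of our actual model:

```python
import math

def pairwise_preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log(sigmoid(r_chosen - r_rejected)).

    Small when the reward model scores the preferred answer higher,
    large when the ranking is inverted.
    """
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When the chosen answer scores higher, the loss is small...
low = pairwise_preference_loss(2.0, -1.0)
# ...and large when the ranking is inverted.
high = pairwise_preference_loss(-1.0, 2.0)
```

Minimising this loss over human- or AI-labelled comparison pairs yields the reward signal that the PPO stage of Week 11 onward would then optimise against.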