Generative AI: 1. Ethics 2. CLIP


Project Plan and Milestones

Weekly Plan

{| class="wikitable"
|-
! Date !! Task !! Completion
|-
!scope="row"|Week 4
|
* Paper reading.
* Exploring existing RLHF and RLAIF approaches.
* Exploring red-teaming datasets.
|√
|-
!scope="row"|Week 5
|
* Familiarizing with the Dromedary, SALMON, and Llama base models.
|√
|-
!scope="row"|Week 6
|
* Evaluation of different base models.
* Choice of the Llama 2 model as our baseline.
|√
|-
!scope="row"|Week 7
|
* Red-teaming dataset exploration.
* Reading about ethical theories.
|√
|-
!scope="row"|Week 8
|
* Exploring the [https://github.com/hendrycks/ethics ETHICS dataset].
|√
|-
!scope="row"|Week 9
|
* ETHICS dataset formatting for Llama fine-tuning and evaluation (a formatting sketch follows this table).
* Supervised fine-tuning of the Llama model.
|√
|-
!scope="row"|Week 10
|
* Evaluation of the Llama model before and after fine-tuning on the ETHICS dataset.
* Mid-term presentation & start of the Wikipedia page with the project plan.
|√
|-
!scope="row"|Week 11
|
* Reading about reinforcement learning with PPO.
* Re-formatting the deontology dataset.
* Creation of the preference model.
|√
|-
!scope="row"|Week 12
|
|
|-
!scope="row"|Week 13
|
|
|-
!scope="row"|Week 14
|
* Writing the Wikipedia page & final presentation.
|√
|}
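
Week 9's formatting step could look roughly like the sketch below, which turns ETHICS deontology rows into prompt/completion pairs for supervised fine-tuning. The column names ("label", "scenario", "excuse") follow the CSV layout of the hendrycks/ethics deontology split; the prompt template, label convention, and file paths are illustrative assumptions rather than the exact pipeline used.

<syntaxhighlight lang="python">
# Sketch: convert ETHICS deontology CSV rows into prompt/completion pairs
# (JSONL) for supervised fine-tuning. Paths and the prompt template are
# illustrative assumptions.
import csv
import json

def format_example(scenario: str, excuse: str, label: int) -> dict:
    prompt = (
        "Scenario: " + scenario + "\n"
        "Excuse: " + excuse + "\n"
        "Question: Is this excuse reasonable? Answer yes or no.\n"
        "Answer:"
    )
    completion = " yes" if label == 1 else " no"  # assumes 1 = reasonable excuse
    return {"prompt": prompt, "completion": completion}

def convert(csv_path: str, out_path: str) -> None:
    with open(csv_path, newline="") as f_in, open(out_path, "w") as f_out:
        for row in csv.DictReader(f_in):
            example = format_example(row["scenario"], row["excuse"], int(row["label"]))
            f_out.write(json.dumps(example) + "\n")

convert("deontology/deontology_train.csv", "deontology_train.jsonl")
</syntaxhighlight>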

Milestone 1

  • Define our research questions.
  • Read papers on existing studies in this field.
  • Explore different ethical theories.
  • Find an appropriate ethical dataset.

Milestone 2

  • Refine our research questions.
  • Finish preparing the whole dataset.
  • Run the model and fine-tune it on the GPU (see the sketch after this list).
  • Evaluate our fine-tuned supervised model.
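
The fine-tuning run itself could follow a standard Hugging Face Trainer setup, as in the minimal sketch below, assuming the prompt/completion JSONL produced by the formatting step; the checkpoint name, sequence length, and hyperparameters are placeholders, not the settings actually used.

<syntaxhighlight lang="python">
# Sketch: supervised fine-tuning of a Llama-style model on prompt/completion
# pairs with the Hugging Face Trainer. Checkpoint and hyperparameters are
# placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("json", data_files="deontology_train.jsonl")["train"]

def tokenize(batch):
    # Concatenate prompt and completion into a single training sequence.
    texts = [p + c for p, c in zip(batch["prompt"], batch["completion"])]
    return tokenizer(texts, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama-ethics-sft",
                           per_device_train_batch_size=4,
                           num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
</syntaxhighlight>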

Milestone 3

  • Obtain our preference and reinforcement learning models (see the sketch after this list).
  • Analyze the results.
  • Write the Wikipedia page.
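
The preference model can be sketched as a pairwise reward model: a sequence-classification head is trained so that a "chosen" response scores higher than a "rejected" one, and its scalar score would then serve as the reward signal for PPO. The backbone checkpoint and the example pair below are placeholders, not the actual training data.

<syntaxhighlight lang="python">
# Sketch: pairwise preference (reward) model with a Bradley-Terry loss.
# Backbone checkpoint and example texts are illustrative placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "distilbert-base-uncased"  # small placeholder backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
reward_model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1)
optimizer = torch.optim.AdamW(reward_model.parameters(), lr=1e-5)

def score(texts):
    # Scalar score for each prompt+response text.
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    return reward_model(**inputs).logits.squeeze(-1)

# One training step on a toy preference pair.
chosen = ["Scenario ... Excuse that respects the deontological constraint."]
rejected = ["Scenario ... Excuse that violates it."]
optimizer.zero_grad()
loss = -torch.nn.functional.logsigmoid(score(chosen) - score(rejected)).mean()
loss.backward()
optimizer.step()
</syntaxhighlight>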