Generative AI: 1. Ethics 2. CLIP

== Project Plan and Milestones ==

=== Weekly Plan ===

{| class="wikitable"
|-
! Date !! Exploration !! Application !! Evaluation !! Report
|-
!scope="row"|Week 4
| Paper reading. <br> Exploring existing RLHF and RLAIF approaches. <br> Exploring red-teaming datasets.
|
|
|
|-
!scope="row"|Week 5
|
| Familiarizing with the Dromedary, SALMON, and Llama base models.
|
|
|-
!scope="row"|Week 6
|
|
| Evaluation of different base models. <br> Choice of the Llama 2 model as our baseline.
|
|-
!scope="row"|Week 7
| Red-teaming dataset exploration. <br> Reading about ethical theories.
|
|
|
|-
!scope="row"|Week 8
| Discovering the [https://github.com/hendrycks/ethics ETHICS dataset].
|
|
|
|-
!scope="row"|Week 9
|
| ETHICS dataset formatting for Llama fine-tuning and evaluation (see the formatting sketch below the table). <br> Llama supervised model fine-tuning.
|
|
|-
!scope="row"|Week 10
|
|
| Evaluation of the Llama model before and after fine-tuning on the ETHICS dataset.
| Mid-term presentation & start writing the Wikipedia page with the plan.
|-
!scope="row"|Week 11
| Reading about reinforcement learning with PPO (the clipped objective is recalled below the table).
| Re-formatting the deontology dataset. <br> Creation of the preference model (loss sketch after the milestones).
|
|
|-
!scope="row"|Week 12
|
|
|
|
|-
!scope="row"|Week 13
|
|
|
|
|-
!scope="row"|Week 14
|
|
|
| Write the Wikipedia page & final presentation.
|}
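To make the Week 9 formatting step concrete, below is a minimal sketch of how one ETHICS deontology row could be turned into a prompt/completion pair for supervised fine-tuning. It is an illustration, not our final pipeline: the CSV column names (<code>label</code>, <code>scenario</code>, <code>excuse</code>) and file path should be checked against the downloaded dataset, and the yes/no template is a choice made here for the example.

<syntaxhighlight lang="python">
# Minimal sketch: one ETHICS deontology row -> prompt/completion pair.
# Assumptions: CSV columns "label", "scenario", "excuse" (verify against
# your copy of the dataset), and a yes/no answer format chosen for
# illustration only.
import csv

PROMPT_TEMPLATE = (
    "Scenario: {scenario}\n"
    "Excuse: {excuse}\n"
    "Question: Is this excuse reasonable? Answer yes or no.\n"
    "Answer:"
)

def format_example(row: dict) -> dict:
    """Map one CSV row to a prompt/completion pair."""
    prompt = PROMPT_TEMPLATE.format(scenario=row["scenario"], excuse=row["excuse"])
    completion = " yes" if row["label"] == "1" else " no"  # 1 = reasonable excuse
    return {"prompt": prompt, "completion": completion}

# Hypothetical path; the actual file layout depends on the download.
with open("deontology/deontology_train.csv", newline="") as f:
    examples = [format_example(row) for row in csv.DictReader(f)]
</syntaxhighlight>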
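The Week 9 fine-tuning step could then use a bare-bones loop like the sketch below. It assumes Hugging Face <code>transformers</code>, accepted access to the gated <code>meta-llama/Llama-2-7b-hf</code> checkpoint, a single sufficiently large GPU, and the <code>examples</code> list from the previous sketch; a real run would batch inputs, mask prompt tokens out of the loss, and likely use parameter-efficient fine-tuning rather than full-parameter updates.

<syntaxhighlight lang="python">
# Bare-bones supervised fine-tuning sketch (illustration only).
# Assumes: `examples` from the previous sketch, one large GPU, and
# accepted access to the gated meta-llama/Llama-2-7b-hf checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

model.train()
for ex in examples:
    text = ex["prompt"] + ex["completion"]
    batch = tokenizer(text, return_tensors="pt").to(model.device)
    # Causal-LM loss over the whole sequence; a real pipeline would
    # mask the prompt tokens so only the answer is learned.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
</syntaxhighlight>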
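Week 11's reading centres on PPO. For reference, the clipped surrogate objective from Schulman et al. (2017) that RLHF pipelines maximize is

<math>L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\!\left[\min\!\left(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\!\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)},</math>

where <math>\hat{A}_t</math> is an advantage estimate and <math>\epsilon</math> the clipping range.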

=== Milestone 1 ===
* Choose the project subject.
* Read papers about existing studies in this field.
* Define our research questions.

=== Milestone 2 ===
* Refine our research questions.
* Explore different ethical theories.
* Find an appropriate dataset.
* Evaluate our fine-tuned supervised model.

=== Milestone 3 ===
* Obtain our preference and reinforcement learning models (a preference-loss sketch follows this list).
* Analyze the results.
* Write the Wikipedia page.
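For the preference model, the standard RLHF recipe trains a reward model on pairwise comparisons with a Bradley-Terry style loss. The sketch below is a generic PyTorch illustration, not our implementation: <code>reward_model</code> is a placeholder scorer, and the random tensors stand in for encoded (prompt, chosen) / (prompt, rejected) response pairs.

<syntaxhighlight lang="python">
# Generic pairwise preference (reward-model) loss, Bradley-Terry style.
# Illustration only: `reward_model` is a placeholder scorer, and the
# random tensors stand in for encoded chosen/rejected response pairs.
import torch
import torch.nn.functional as F

def preference_loss(reward_model, chosen: torch.Tensor, rejected: torch.Tensor) -> torch.Tensor:
    """-log sigmoid(r_chosen - r_rejected), averaged over the batch."""
    r_chosen = reward_model(chosen)      # shape: (batch, 1)
    r_rejected = reward_model(rejected)  # shape: (batch, 1)
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy usage with a linear scorer over 16-dimensional encodings.
reward_model = torch.nn.Linear(16, 1)
chosen, rejected = torch.randn(8, 16), torch.randn(8, 16)
loss = preference_loss(reward_model, chosen, rejected)
loss.backward()
</syntaxhighlight>

In an RLHF pipeline, the trained scorer then supplies the reward signal that the PPO stage optimizes against.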