CodeLlama Fine-tuning

An open-source, Llama-based code generation model

Xin Cheng
6 min read · Oct 20, 2023

CodeLlama is a family of models based on Llama 2 and further trained on code, available in base, Python-specialized, and instruction-tuned variants.

Inference

7b foundation model

Code completion

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

model_id = "codellama/CodeLlama-7b-hf"
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",
)

prompt = 'def remove_non_ascii(s: str) -> str:\n """ '
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

output = model.generate(
    inputs["input_ids"],
    max_new_tokens=200,
    do_sample=True,
    top_p=0.9,
    temperature=0.1,
)
output = output[0].to("cpu")
print(tokenizer.decode(output))

Result

<s> def remove_non_ascii(s: str) -> str:
    """
    Remove non-ASCII characters from a string.
    """
    return "".join(i for i in s if ord(i) < 128)


def remove_non_ascii_and_punctuation(s: str) -> str:
    """
    Remove non-ASCII characters and punctuation from a string.
    """
    return "".join(i for i in s if ord(i) < 128 and not i in string.punctuation)


def remove_non_ascii_and_punctuation_and_whitespace(s: str) -> str:
    """
    Remove non-ASCII characters, punctuation, and whitespace from a string.
    """
    return "".join(i for i in s if ord(i) < 128 and not

Conversation (doesn't work with the base model, which continues the text instead of answering, as the results below show; an instruction-tuned model is needed)

Bash task

prompt = 'In Bash, how do I list all text files in the current directory (excluding subdirectories) that have been modified in the last month?'
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

output = model.generate(
    inputs["input_ids"],
    max_new_tokens=200,
    do_sample=True,
    top_p=0.9,
    temperature=0.1,
)
output = output[0].to("cpu")
print(tokenizer.decode(output))

Result

<s> In Bash, how do I list all text files in the current directory (excluding subdirectories) that have been modified in the last month?



Posted by Bill Karwin (bkarwin) on 2007-08-22T19:05:00.000+0000

Assuming you have GNU find, you can use the -mtime option to find files modified in the last month.





Posted by Bill Karwin (bkarwin) on 2007-08-22T19:06:15.000+0000

I'm not sure what you mean by "list all text files". Do you mean list the names of the files?





Posted by Bill Karwin (bkarwin) on 2007-08-22T19:07:00.000+0000

Python task

system = "Provide answers in Python"
user = "Write a function that computes the set of sums of all contiguous sublists of a given list."

prompt = f"<s>[INST] <<SYS>>\\n{system}\\n<</SYS>>\\n\\n{user}[/INST]"
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")

output = model.generate(
    inputs["input_ids"],
    max_new_tokens=200,
    do_sample=True,
    top_p=0.9,
    temperature=0.1,
)
output = output[0].to("cpu")
print(tokenizer.decode(output))

Result

<s> [INST] <<SYS>>\nProvide answers in Python\n<</SYS>>\n\nWrite a function that computes the set of sums of all contiguous sublists of a given list.[/INST]
[INST]<<SYS>>\nProvide answers in Python\n<</SYS>>\n\nWrite a function that computes the set of sums of all contiguous sublists of a given list.[/INST]
[INST]<<SYS>>\nProvide answers in Python\n<</SYS>>\n\nWrite a function that computes the set of sums of all contiguous sublists of a given list.[/INST]
[INST]<<SYS>>\nProvide answers in Python\n<</SYS>>\n\nWrite a function that computes the set of sums of all contiguous sublists of a given list.[/INST]
[INST]<<SYS>>\nProvide answers in Python\n<</SYS>>\n\nWrite a function that computes the set of sums of all cont

7b instruction-tuned model

Code completion

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

model_id = "codellama/CodeLlama-7b-Instruct-hf"
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",
)

prompt = 'def remove_non_ascii(s: str) -> str:\n """ '
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

output = model.generate(
    inputs["input_ids"],
    max_new_tokens=200,
    do_sample=True,
    top_p=0.9,
    temperature=0.1,
)
output = output[0].to("cpu")
print(tokenizer.decode(output))

Result

<s> def remove_non_ascii(s: str) -> str:
    """
    Remove non-ASCII characters from a string.

    Args:
        s (str): The string to remove non-ASCII characters from.

    Returns:
        str: The string with non-ASCII characters removed.
    """
    return "".join(c for c in s if ord(c) < 128)


def remove_non_ascii_from_list(l: list) -> list:
    """
    Remove non-ASCII characters from a list of strings.

    Args:
        l (list): The list of strings to remove non-ASCII characters from.

    Returns:
        list: The list of strings with non-ASCII characters removed.
    """
    return [remove_non_ascii(s) for s in l]


def remove_non_ascii_from_

Python task

system = "Provide answers in Python"
user = "Write a function that computes the set of sums of all contiguous sublists of a given list."

prompt = f"<s>[INST] <<SYS>>\\n{system}\\n<</SYS>>\\n\\n{user}[/INST]"
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")

output = model.generate(
    inputs["input_ids"],
    max_new_tokens=200,
    do_sample=True,
    top_p=0.9,
    temperature=0.1,
)
output = output[0].to("cpu")
print(tokenizer.decode(output))

Result

<s> [INST] <<SYS>>\nProvide answers in Python\n<</SYS>>\n\nWrite a function that computes the set of sums of all contiguous sublists of a given list.[/INST]  ```
def compute_sums(my_list):
    return [sum(my_list[i:j]) for i in range(len(my_list)) for j in range(i+1, len(my_list)+1)]
```
This function uses list comprehension to iterate over the indices of the input list, and for each index `i`, it computes the sum of all sublists of length `j` starting from index `i`. The resulting list of sums is returned by the function.

For example, if the input list is `[1, 2, 3, 4, 5]`, the function will return `[1, 3, 6, 10, 15]`.

Note that this function assumes that the input list is a flat list, i.e. it does not contain any nested lists. If the input list contains nested lists, the function may not
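As an aside (not in the original article): recent versions of transformers (4.34 and later) can build this [INST]/<<SYS>> prompt from the tokenizer's chat template, which inserts the special tokens and real newlines for you instead of hand-escaping them. A minimal sketch, assuming the CodeLlama-Instruct tokenizer ships the standard Llama 2 chat template:

# Sketch: requires transformers >= 4.34, where apply_chat_template is available.
messages = [
    {"role": "system", "content": "Provide answers in Python"},
    {"role": "user", "content": "Write a function that computes the set of sums "
                                "of all contiguous sublists of a given list."},
]
# apply_chat_template renders the messages into Llama 2 [INST]/<<SYS>> format
# and returns token ids directly.
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to("cuda")
output = model.generate(input_ids, max_new_tokens=200, do_sample=True, top_p=0.9, temperature=0.1)
print(tokenizer.decode(output[0]))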

Python Fibonacci task

system = "Provide answers in Python"
user = "Write a function that computes fibonacci series"

prompt = f"<s>[INST] <<SYS>>\\n{system}\\n<</SYS>>\\n\\n{user}[/INST]"
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")

output = model.generate(
    inputs["input_ids"],
    max_new_tokens=200,
    do_sample=True,
    top_p=0.9,
    temperature=0.1,
)
output = output[0].to("cpu")
print(tokenizer.decode(output))

Result

<s> [INST] <<SYS>>\nProvide answers in Python\n<</SYS>>\n\nWrite a function that computes fibonacci series[/INST]  Here is a function that computes the Fibonacci series:
```
def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)
```
This function uses a recursive approach to compute the Fibonacci series. It takes an integer `n` as input and returns the `n`-th Fibonacci number. The function first checks if `n` is less than or equal to 1. If it is, the function returns `n`. Otherwise, it calls itself twice with `n-1` and `n-2` as arguments, and then adds the two results together to get the final answer.

For example, if we call the function with `n=5`, it will compute the Fibonacci series as follows:
```
fibonacci(
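A side note (my addition, not part of the model's answer): the recursive version above recomputes the same subproblems and runs in exponential time. An iterative variant computes the same series in linear time:

def fibonacci(n: int) -> int:
    # Iterative variant: O(n) time, O(1) space.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a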

Javascript task

system = "Provide answers in Javascript"
user = "Write a function that computes the set of sums of all contiguous sublists of a given list."

prompt = f"<s>[INST] <<SYS>>\\n{system}\\n<</SYS>>\\n\\n{user}[/INST]"
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")

output = model.generate(
    inputs["input_ids"],
    max_new_tokens=200,
    do_sample=True,
    top_p=0.9,
    temperature=0.1,
)
output = output[0].to("cpu")
print(tokenizer.decode(output))

Result

<s> [INST] <<SYS>>\nProvide answers in Javascript\n<</SYS>>\n\nWrite a function that computes the set of sums of all contiguous sublists of a given list.[/INST]  ```
function computeSums(list) {
  let sums = [];
  for (let i = 0; i < list.length; i++) {
    let sum = 0;
    for (let j = i; j < list.length; j++) {
      sum += list[j];
    }
    sums.push(sum);
  }
  return sums;
}
```
This function takes a list as input and returns a list of all the sums of contiguous sublists of the input list.

For example, if the input list is `[1, 2, 3, 4, 5]`, the output list would be `[15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3

Fine-tuning

This notebook fine-tunes codellama-7b on the “b-mc2/sql-create-context” dataset, which generates SQL code from a question and a context (a CREATE TABLE statement). Example below:

question: How many heads of the departments are older than 56 ?

context: CREATE TABLE head (age INTEGER)

answer: SELECT COUNT(*) FROM head WHERE age > 56
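For context, a sketch of loading this dataset with the Hugging Face datasets library (my addition, not the notebook's exact code); the field names match the example above:

from datasets import load_dataset

# Assumption: the dataset exposes "question", "context", and "answer" fields,
# as in the example above.
dataset = load_dataset("b-mc2/sql-create-context", split="train")
dataset = dataset.train_test_split(test_size=0.1)  # hold out a small eval split
print(dataset["train"][0])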

We need to install the latest version of peft, so change the following lines:

!pip install -U git+https://github.com/huggingface/peft.git
# import locale # colab workaround
# locale.getpreferredencoding = lambda: "UTF-8" # colab workaround

The notebook prepares the data in the following format:

### Input:
{data_point["question"]}

### Context:
{data_point["context"]}

### Response:
{data_point["answer"]}

Use 4-bit quantization to load the model (instead of the 8-bit quantization used in the original article):

base_model = "codellama/CodeLlama-7b-hf"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    # load_in_8bit=True,  # replaced by the 4-bit config above
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
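The rest of the training setup lives in the notebook; as a rough sketch of standard peft usage (not a verbatim copy of the notebook, and the hyperparameters are illustrative), LoRA adapters are attached on top of the quantized model before training:

from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Standard QLoRA-style setup; the notebook's exact hyperparameters may differ.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,                      # LoRA rank
    lora_alpha=16,             # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable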
