Skip to content Skip to sidebar Skip to footer

Create Matrix Using Python

I have 6 text files (each corresponds to a specific sample) and each file looks like this: Gene_ID Gene_Name Strand Start End Length Coverage FPKM TPM ENSMUSG0000010273

Solution 1:

One way to do this is keep one dictionary that stores sample values for each gene_id.

Initialize dictionary = {}

Iterate through each of the 6 files and do:

for file in [f1,f2,f3..f6]:
   for line in file:
        labels = line.split(" ")
        val = 1if labels[8] else0if labels[0] not in dictionary:
        dictionary[labels[0]] = {'name' : labels[1], 'sample' : [val]}            
     else:
        dictionary[labels[0]]['sample'].append(val) 

This will store keys as gene_id and name, sample(list of 6 sample_ids) as values.

You can now write to output file just by iterating through the keys and values.

f = open("output.txt","w+")
f.write("gene_id,gene_name,sample1,sample2,sample3,sample4,sample5,sample6\n")
for key in dictionary.keys():
    samples = ",".join(dictionary[key]['sample'])
    f.write(dictionary[key]+","+dictionary[key]['name']+","+samples+"\n")
f.close()

Post a Comment for "Create Matrix Using Python"