Diwakar G 7

Reading the body of Attachment using Python

Hi,
I am trying to read the body of an attachment and write it to a text file using Python. Below is the sample code.
import urllib
import requests
import base64
import json
from simple_salesforce import Salesforce
import urllib.request as urllib



instance = ''

sf = Salesforce(username='username', password='password', security_token='security_token')
sessionId = sf.session_id

attachment = sf.query("SELECT id, name,Body FROM Attachment where parentID =''")
body = ""

req = urllib.Request('https://%s.salesforce.com/services/data/v38.0/sobjects/Attachment/<id>/Body/' % instance,
    headers = { 'Content-Type': 'application/json', 'Authorization': 'Bearer %s' % sessionId })
f = urllib.urlopen(req)
f1 = open("demofile.txt", "wb")
for x in f:
    f1.write(x)
f.close()
f1.close()
But I am getting the error below.
f1.write(x)
TypeError: write() argument must be str, not bytes
Please correct me if something is wrong.

Thanks and Regards,
Diwakar G
Himanshu Patel 46
Hi Diwakar, 

Iterating over the result of urllib.urlopen(req) yields bytes objects. Try changing it to urllib.urlopen(req).read().decode('utf-8').

Let me know how it goes. 

 
Alain Cabon
Hi,
 
import requests
import base64
from simple_salesforce import Salesforce

sf = Salesforce(username='username', password='password', security_token='security_token')
sessionId = sf.session_id
instance = sf.sf_instance
print (sessionId)

attachment = sf.query("SELECT Id, Name, Body FROM Attachment where Id ='<id>'")
print(attachment)

response = requests.get('https://' + instance + '/services/data/v39.0/sobjects/Attachment/<id>/body',
    headers = { 'Content-Type': 'application/text', 'Authorization': 'Bearer ' + sessionId })
print(response.text)

f1 = open("demofile.txt", "wb")
f1.write(response.text)
f1.close()

response.close()

 
Diwakar G 7
Hi Alain,

Below is the output of print(response.text)
��Ok�0������%�*%���.H�=��8�%�����~�6I��sY�F���F���7>� ����Y.26�.�
�}��"2$J�c�B
Along with that, I am getting the error below.
f1.write(response.text)
TypeError: a bytes-like object is required, not 'str'
Thanks and Regards,
Diwakar G
 
Diwakar G 7
Hi Himanshu,

I am getting the below error.
f = urllib.urlopen(req).read().decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 22: invalid start byte

Thanks and Regards,
Diwakar G
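
All three errors reported so far in this thread seem to come from the same mismatch between bytes and str. The sketch below tries to reproduce each one locally, with no Salesforce call. The payload is made-up bytes standing in for an attachment body, and try_write is a hypothetical helper name used only for illustration.

```python
import os
import tempfile

# Made-up bytes standing in for a binary attachment body (not real Salesforce data).
payload = b"PK\x03\x04\xa0\xff binary data"

def try_write(mode, data):
    """Write data to a temp file opened with mode; return 'ok' or the error type name."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, mode) as fh:
            fh.write(data)
        return "ok"
    except TypeError as exc:
        return type(exc).__name__
    finally:
        os.remove(path)

results = {
    # Text mode expects str, so bytes are rejected.
    "bytes into text mode": try_write("w", payload),
    # Binary mode expects bytes, so str is rejected.
    "str into binary mode": try_write("wb", "some text"),
    # Bytes into binary mode is the matching combination.
    "bytes into binary mode": try_write("wb", payload),
}

# Arbitrary binary data is usually not valid UTF-8, so decoding it fails.
try:
    payload.decode("utf-8")
    results["decode as utf-8"] = "ok"
except UnicodeDecodeError as exc:
    results["decode as utf-8"] = type(exc).__name__

for case, outcome in results.items():
    print(f"{case}: {outcome}")
```

Only the third case succeeds, which suggests that keeping the body as bytes from download to disk avoids all three errors.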
Alain Cabon
Hi Diwakar,

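Use f1.write(response.content) for binary content. response.content holds the raw bytes of the reply, while response.text is the same data decoded to str, which is why writing it to a file opened with "wb" fails.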
import os
import requests
from simple_salesforce import Salesforce

# Log in; the session ID is reused below as the bearer token for the REST call
sf = Salesforce(username='username', password='password', security_token='security_token')
sessionId = sf.session_id
instance = sf.sf_instance
print('sessionId: ' + sessionId)

# Look up the attachment record to get its Id and file name
attachment = sf.query("SELECT Id, Name, Body FROM Attachment WHERE Name='test.xlsx' LIMIT 1")
filename = attachment['records'][0]['Name']
fileid = attachment['records'][0]['Id']
print('filename: ' + filename)
print('fileid: ' + fileid)

# Download the attachment body
response = requests.get('https://' + instance + '/services/data/v39.0/sobjects/Attachment/' + fileid + '/body',
    headers = { 'Content-Type': 'application/text', 'Authorization': 'Bearer ' + sessionId })

# Open the output file in binary mode and write the raw bytes
f1 = open(filename, "wb")
f1.write(response.content)
f1.close()
print('output file: ' + os.path.realpath(f1.name))
response.close()

 
This was selected as the best answer
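
The file-writing half of the answer above can be tried on its own, without a Salesforce login. This is a minimal sketch under stated assumptions: save_bytes is a hypothetical helper name, the bytes are made up, and in the real script data would be response.content. Taking only the base name of the file is an extra precaution that is not in the original answer.

```python
import os
import tempfile

def save_bytes(data: bytes, filename: str, directory: str) -> str:
    """Write raw bytes to directory/filename and return the resolved path."""
    # Keep only the final path component so a name like '../x' stays inside directory.
    safe_name = os.path.basename(filename)
    path = os.path.join(directory, safe_name)
    # Binary mode: the bytes are written unchanged, with no encoding step.
    with open(path, "wb") as fh:
        fh.write(data)
    return os.path.realpath(path)

# Usage with made-up bytes standing in for response.content.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_bytes(b"\x00\x01\xa0\xff", "test.xlsx", tmp)
    with open(saved, "rb") as fh:
        round_trip = fh.read()
    print(os.path.basename(saved), len(round_trip))
```

Reading the file back in "rb" mode returns the same bytes that were written, which is the property a binary attachment such as an .xlsx file needs.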
Diwakar G 7
Thank you so much Alain.
jacob alex 6
The error you're encountering, TypeError: write() argument must be str, not bytes, occurs because you're writing binary data to a file opened in text mode, where write() expects a str. To fix this, open the file in binary write mode ('wb') instead of text write mode ('w'). Here's a modified version of your code that does this:
import urllib.request as urllib
from simple_salesforce import Salesforce

# Your Salesforce instance URL (host name only; 'https://' is added below)
instance = '<your_instance_url>'

# Use the simple_salesforce library to authenticate
sf = Salesforce(username='username', password='password', security_token='security_token')
sessionId = sf.session_id

# Query for Attachments
attachments = sf.query("SELECT Id, Name, Body FROM Attachment WHERE ParentId = '<your_parent_id>'")

for attachment in attachments['records']:
    attachment_id = attachment['Id']

    # Make a request to get the attachment body
    req = urllib.Request('https://{}/services/data/v38.0/sobjects/Attachment/{}/Body'.format(instance, attachment_id),
                        headers={'Authorization': 'Bearer {}'.format(sessionId)})

    # Read the body as bytes and write it to a file opened in binary mode
    with urllib.urlopen(req) as f:
        with open(attachment['Name'], 'wb') as f1:
            f1.write(f.read())
In this modified code:
The attachment body is read as binary data, and we open the file in binary write mode ('wb') using open.
We loop through the attachments fetched from Salesforce and download each attachment's body, saving it with the name specified in the 'Name' field.
Make sure to replace <your_instance_url> with your Salesforce instance URL and <your_parent_id> with the appropriate ParentId for the attachments you want to retrieve.
Answer copied from https://codingspell.com/
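
The code in this answer calls f.read(), which loads the whole body into memory before writing it. For large attachments, a chunked copy may be preferable. The sketch below is one possible variation, not part of the original answer: copy_body is a hypothetical helper name, and io.BytesIO stands in for the file-like object returned by urlopen(req).

```python
import io
import os
import shutil
import tempfile

def copy_body(source, dest_path, chunk_size=64 * 1024):
    """Copy a binary file-like object to dest_path in chunks; return bytes written."""
    with open(dest_path, "wb") as dest:
        # copyfileobj reads and writes chunk_size bytes at a time.
        shutil.copyfileobj(source, dest, chunk_size)
    return os.path.getsize(dest_path)

# io.BytesIO stands in for the response object returned by urlopen(req).
fake_body = io.BytesIO(b"\xa0\xff" * 100_000)

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "attachment.bin")
    written = copy_body(fake_body, target)
    print(written)
```

In the loop above, the same helper could presumably be called as copy_body(f, attachment['Name']) inside the with urllib.urlopen(req) as f: block.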