Notes on S3 backed File Downloads
David Israel
Posted on June 19, 2020
Uclusion recently introduced arbitrary file uploads to our Workspaces or Stories (we previously allowed only images), and there are a few implementation notes I’d like to record for posterity.
Firstly, browsers do not always provide a content type when uploading a file. If your backend demands content types the code something like the below is needed when asking for a presigned post
export function uploadFileToS3 (marketId, file) {
const clientPromise = _.isEmpty(marketId) ? getAccountClient() : getMarketClient(marketId);
const { type, size, name } = file;
// if we don't have a content type, use generic octet stream
const usedType = _.isEmpty(type)? 'application/octet-stream' : type;
return client.getFileUploadData(usedType, size, name))
.then((data) => {
const { metadata, presigned_post } = data;
const { url, fields } = presigned_post;
See production file with this code.
Second, unless your served JS or HTML is coming from the same domain as your CDN, browsers will happily ignore the download attribute of S3 links. They will either open it in a new tab with their viewer, or, for security reasons, use the S3 bucket’s file’s name. This means that if you machine generate paths such as https://dev.imagecdn.uclusion.com/8432f156-f452-4031-8e7d-728deb25f7cd/35fed849-1b69-4f05-9cea-99f57d5f71e9 the user will end up with a file of 35fed849–1b69–4f05–9cea-99f57d5f71e9.
To avoid this, set your content distribution on your presigned post. This won’t effect the downloaded path, but is a method of telling the browser what you want the downloaded filename to be. Here’s code that does that, and also attempts to set the extension on the path appropriately:
def post_validation_function(event, data, context, validation_context):
content_length = data['content_length']
original_name = data.get('original_name', None)
content_type = data['content_type']
file_path = str(uuid.uuid4())
# See if they provided the original name,
# and if so use it, and try to put a good extension on
fields = {"Content-Type": content_type}
if original_name is not None:
fields["Content-Disposition"] = 'attachment; filename="' + original_name + '"'
extension = guess_file_extension(content_type)
if extension is not None:
file_path = file_path + extension
used_id = validation_context['market_id'] if validation_context['market_id'] is not None else validation_context['account_id']
path = used_id + '/' + file_path
return create_post(path, fields, content_type, content_length, bucket)
def create_post(path, fields, content_type, content_length, bucket):
expiration = 60 60 24 # Upload must occur within 24 hours
conditions = [
['content-length-range', content_length, content_length],
{'Content-Type': fields['Content-Type']},
{'Content-Disposition': fields['Content-Disposition']}
]
post = s3_client.generate_presigned_post(bucket,
path,
Fields=fields,
Conditions=conditions,
ExpiresIn=expiration)
return {
'metadata': {
'path': path,
'content_length': content_length,
'content_type': content_type
},
'presigned_post': post
}
Notice that I set the constraints to include all Fields. If you don’t do that your users will get Access Denied whenever they try to upload.
If you don’t mind the users getting generated names, then I would at least attempt to set the file extensions properly. The code above invokes the following function to do so, and extension setting is one of the reasons why we demand content-type to be passed in when doing a presigned post.
import mimetypes
# Constants to give the file upload some additional information on extensions, especially on MS mimetypes
#
MS_TYPES = {
'application/msword': '.doc',
'application/vnd.openxmlformats-officedocument.wordprocessingml.document': '.docx',
'application/mspowerpoint': '.ppt',
'application/vnd.openxmlformats-officedocument.presentationml.presentation': '.pptx',
'application/vnd.openxmlformats-officedocument.presentationml.template': '.potx',
'application/msexcel': '.xls',
'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet': '.xlsx',
'application/vnd.openxmlformats-officedocument.spreadsheetml.template': '.xltx',
'application/msproject': '.mpp',
}
def guess_file_extension(mime_type):
lowercase = mime_type.lower()
system = mimetypes.guess_extension(lowercase)
if system is not None:
return system
# the system doesn't know, try our MS map
return MS_TYPES.get(lowercase, None)
Posted on June 19, 2020
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
November 29, 2024