This is an updated version of my previous implementation for uploading files larger than the available RAM in Go. If you haven't read it yet, you can check it out at the following link.
In the last blog post, I tackled the Go upload method using a WebSocket file-chunking implementation to handle cases where the uploaded file is much larger than the available RAM on the device. That implementation is really helpful when you are developing applications for cheap SBCs with as little as 512MB of RAM.
Recently, I encountered another issue while trying to migrate my whole Google Drive to my own ARM-powered DIY NAS. The issue was that my NAS only has 512MB of RAM and a 32GB microSD card as the OS drive, while 2 x 512GB HDDs are attached to the SBC to store files. Uploading a file larger than 32GB would cause the system to run out of space and crash my ArozOS NAS OS.
In the previous implementation, uploading a 1GB file required 1GB of free space in the tmp folder (i.e. on the SD card) to buffer the file chunks received via WebSocket. In the latest implementation, a new "huge file mode" was added to handle cases where the uploaded file is larger than the tmp folder, by writing the upload chunks directly to the target disk while minimizing the maximum space required across all system disks. Before I show you the code, this is the logic I use to decide when to enter "huge file mode".
Logic for Optimizing Both Upload Space & Time Occupancy
If the file is smaller than 4MB, upload it with a FORM POST (lowest overhead, fastest)
Else if the file is smaller than (the remaining space in tmp) / 16 - 1KB, buffer the file into the tmp folder (the tmp folder should be on a fast medium like an NVMe SSD or a RAM disk; slower than FORM POST but still fast)
Otherwise, buffer the file chunks directly to the target disk (slowest, but gives us the most space to work with; see the sketch after this list)
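To make the cutoffs concrete, here is a minimal sketch of that selection logic. The pickUploadMode function, the mode constants, and the getFreeSpace helper are illustrative names I made up (they are not the actual ArozOS API), and syscall.Statfs is Linux-only:

import "syscall"

type UploadMode int

const (
	ModeFormPost  UploadMode = iota //< 4MB: plain FORM POST
	ModeTmpBuffer                   //chunks buffered in the tmp folder
	ModeHugeFile                    //chunks written straight to the target disk
)

//getFreeSpace returns the free bytes on the filesystem containing path
func getFreeSpace(path string) (int64, error) {
	var stat syscall.Statfs_t
	if err := syscall.Statfs(path, &stat); err != nil {
		return 0, err
	}
	return int64(stat.Bavail) * stat.Bsize, nil
}

func pickUploadMode(fileSize int64, tmpPath string) (UploadMode, error) {
	if fileSize < 4<<20 { //smaller than 4MB
		return ModeFormPost, nil
	}
	tmpFree, err := getFreeSpace(tmpPath)
	if err != nil {
		//Cannot stat tmp: fall back to the mode that needs no tmp space
		return ModeHugeFile, err
	}
	//e.g. with 16GB free in tmp, files under roughly 1GB are still tmp-buffered
	if fileSize < tmpFree/16-1024 {
		return ModeTmpBuffer, nil
	}
	return ModeHugeFile, nil
}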
File Merging Procedures
In the previous implementation, the file merging procedure worked like this:
Create the destination file and open it
Iterate through the chunks, appending each one to the opened destination file
Delete all the chunk files
However, this takes 2x the space of the file being uploaded. It works fine for medium-sized files, but not for huge files. To solve this, I changed the implementation to the following:
Create the destination file and open it
Iterate through the chunks; append each chunk to the opened destination file, confirm the copy succeeded, then remove the source chunk
In simple words, by deleting the chunk files on the fly, the new upload logic only takes up (x + c) bytes, where x is the file size and c is the chunk size. In my design, c is 512KB, so even a 40GB upload peaks at only about 40GB + 512KB of usage on the target disk.
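Here is a stripped-down sketch of just that merge loop (the full handler later in this post does the same thing inline; mergeChunks is an illustrative name, not part of the actual codebase):

import (
	"io"
	"os"
)

//mergeChunks appends each chunk to dstPath in order, deleting every chunk
//as soon as it has been copied, so peak extra usage stays at one chunk (c)
func mergeChunks(dstPath string, chunkPaths []string) error {
	out, err := os.OpenFile(dstPath, os.O_CREATE|os.O_WRONLY, 0755)
	if err != nil {
		return err
	}
	defer out.Close()

	for _, chunk := range chunkPaths {
		src, err := os.Open(chunk)
		if err != nil {
			return err
		}
		_, copyErr := io.Copy(out, src)
		src.Close()
		if copyErr != nil {
			return copyErr
		}
		//Copy confirmed. Delete the chunk immediately to free its space
		if err := os.Remove(chunk); err != nil {
			return err
		}
	}
	return nil
}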
The Code
There is no change to the front-end code except for an extra GET parameter when opening the WebSocket, which defines whether the current upload is a huge file upload. The following is an example implementation for the WebSocket object:
let hugeFileMode = "";
if (file.size > largeFileCutoffSize){
    //Filesize over cutoff line. Use huge file mode
    hugeFileMode = "&hugefile=true";
}
let socket = new WebSocket(protocol + window.location.hostname + ":" + port + "/system/file_system/lowmemUpload?filename=" + encodeURIComponent(filename) + "&path=" + encodeURIComponent(uploadDir) + hugeFileMode);
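On the server side, the handler only needs to check for this parameter. A one-line sketch of how the isHugeFile flag might be derived (illustrative; the exact parsing in ArozOS may differ):

import "net/http"

//isHugeFileUpload reports whether the client requested huge file mode
func isHugeFileUpload(r *http.Request) bool {
	return r.URL.Query().Get("hugefile") == "true"
}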
And here is the Go backend implementation. Note the isHugeFile flag and the //Merge the file section.
targetUploadLocation := filepath.Join(uploadPath, filename)
if !fs.FileExists(uploadPath) {
	os.MkdirAll(uploadPath, 0755)
}

//Generate an UUID for this upload
uploadUUID := uuid.NewV4().String()
uploadFolder := filepath.Join(*tmp_directory, "uploads", uploadUUID)
if isHugeFile {
	//Upload to the same directory as the target location.
	uploadFolder = filepath.Join(uploadPath, ".metadata/.upload", uploadUUID)
}
os.MkdirAll(uploadFolder, 0700)

//Start websocket connection
var upgrader = websocket.Upgrader{}
upgrader.CheckOrigin = func(r *http.Request) bool { return true }
c, err := upgrader.Upgrade(w, r, nil)
if err != nil {
	log.Println("Failed to upgrade websocket connection: ", err.Error())
	w.WriteHeader(http.StatusInternalServerError)
	w.Write([]byte("500 WebSocket upgrade failed"))
	return
}
defer c.Close()

//Handle WebSocket upload
blockCounter := 0
chunkName := []string{}
lastChunkArrivalTime := time.Now().Unix()

//Setup a timeout listener, check if connection still active every 1 minute
ticker := time.NewTicker(60 * time.Second)
done := make(chan bool)
go func() {
	for {
		select {
		case <-done:
			return
		case <-ticker.C:
			if time.Now().Unix()-lastChunkArrivalTime > 300 {
				//Already 5 minutes without new data arrival. Stop connection
				log.Println("Upload WebSocket connection timeout. Disconnecting.")
				c.WriteControl(8, []byte{}, time.Now().Add(time.Second))
				time.Sleep(1 * time.Second)
				c.Close()
				return
			}
		}
	}
}()

totalFileSize := int64(0)
for {
	mt, message, err := c.ReadMessage()
	if err != nil {
		//Connection closed by client. Clear the tmp folder and exit
		log.Println("Upload terminated by client. Cleaning tmp folder.")
		time.Sleep(1 * time.Second)
		os.RemoveAll(uploadFolder)
		return
	}

	//The mt should be 2 = binary for file upload and 1 for control syntax
	if mt == 1 {
		msg := strings.TrimSpace(string(message))
		if msg == "done" {
			//Start the merging process
			break
		} else {
			//Unknown operations
		}
	} else if mt == 2 {
		//File block. Save it to tmp folder
		chunkFilepath := filepath.Join(uploadFolder, "upld_"+strconv.Itoa(blockCounter))
		chunkName = append(chunkName, chunkFilepath)
		writeErr := ioutil.WriteFile(chunkFilepath, message, 0700)
		if writeErr != nil {
			//Unable to write block. Is the tmp folder full?
			log.Println("[Upload] Upload chunk write failed: " + writeErr.Error())
			c.WriteMessage(1, []byte(`{"error":"Write file chunk to disk failed"}`))

			//Close the connection
			c.WriteControl(8, []byte{}, time.Now().Add(time.Second))
			time.Sleep(1 * time.Second)
			c.Close()

			//Clear the tmp files
			os.RemoveAll(uploadFolder)
			return
		}

		//Update the last upload chunk time
		lastChunkArrivalTime = time.Now().Unix()

		//Check if the file size is too big
		totalFileSize += fs.GetFileSize(chunkFilepath)
		if totalFileSize > max_upload_size {
			//File too big
			c.WriteMessage(1, []byte(`{"error":"File size too large"}`))

			//Close the connection
			c.WriteControl(8, []byte{}, time.Now().Add(time.Second))
			time.Sleep(1 * time.Second)
			c.Close()

			//Clear the tmp files
			os.RemoveAll(uploadFolder)
			return
		}
		blockCounter++

		//Request client to send the next chunk
		c.WriteMessage(1, []byte("next"))
	}
}

//Try to decode the location if possible
decodedUploadLocation, err := url.QueryUnescape(targetUploadLocation)
if err != nil {
	decodedUploadLocation = targetUploadLocation
}

//Do not allow % sign in filename. Replace all with underscore
decodedUploadLocation = strings.ReplaceAll(decodedUploadLocation, "%", "_")

//Merge the file
out, err := os.OpenFile(decodedUploadLocation, os.O_CREATE|os.O_WRONLY, 0755)
if err != nil {
	log.Println("Failed to open file:", err)
	c.WriteMessage(1, []byte(`{"error":"Failed to open destination file"}`))
	c.WriteControl(8, []byte{}, time.Now().Add(time.Second))
	time.Sleep(1 * time.Second)
	c.Close()
	return
}

for _, filesrc := range chunkName {
	srcChunkReader, err := os.Open(filesrc)
	if err != nil {
		log.Println("Failed to open Source Chunk", filesrc, " with error ", err.Error())
		c.WriteMessage(1, []byte(`{"error":"Failed to open Source Chunk"}`))
		return
	}
	io.Copy(out, srcChunkReader)
	srcChunkReader.Close()

	//Delete file immediately to save space
	os.Remove(filesrc)
}
out.Close()

//Return complete signal
c.WriteMessage(1, []byte("OK"))

//Stop the timeout listener
done <- true

//Clear the tmp folder
time.Sleep(300 * time.Millisecond)
err = os.RemoveAll(uploadFolder)
if err != nil {
	log.Println(err)
}

//Close WebSocket connection after finished
c.WriteControl(8, []byte{}, time.Now().Add(time.Second))
time.Sleep(300 * time.Millisecond)
c.Close()
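One design note worth pointing out: in huge file mode, the chunks are staged under .metadata/.upload inside the upload target itself, so both the chunk files and the merged destination file live on the same (large) disk. The tmp folder on the SD card is never touched, at the cost of the slower same-disk merge described below.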
And now you can have infinite file upload sizes
So there you have it. Now you can upload an infinitely large file into your system, as long as you have enough disk space to store it. Note that this upload method is very slow: merging the file takes more than twice as long as the previous method, because reading the file chunks and writing the destination file both happen on the same disk. But for my use case, at least it works well enough for files that are too large to fit into the system RAM or the tmp/ folder.
I have no idea who will ever find this useful other than myself working on the ArozOS project. People with these issues nowadays usually just dump their files onto AWS or whatever cloud service provider offers large file storage. But if you find it useful, or you have an even better implementation, feel free to let me know so we can further improve the design :)