4 Simple Ways to Split Word Documents Using C#
Lucy Muturi
Posted on August 16, 2024
TL;DR: Learn to split Word documents into manageable segments using C# and the Syncfusion .NET Word Library (DocIO). This guide provides practical code examples for splitting documents by sections, headings, bookmarks, and placeholder text, making document management more efficient.
Handling extensive Word documents can be challenging, especially in terms of collaboration and organization. Breaking these documents into smaller, more manageable sections can significantly enhance productivity and streamline your workflow.
The Syncfusion .NET Word Library (DocIO) provides a robust and programmatic solution for splitting Word documents without needing Microsoft Word or any interop dependencies. With DocIO, you can effortlessly split documents and save them in various formats such as DOCX, RTF, HTML, PDF, images, and more.
In this blog, we’ll explore four easy methods for splitting Word documents using C#, which will help you optimize your document management process.
Note: Before proceeding, refer to the .NET Word Library getting started documentation.
Getting started with the .NET Word library
Follow these steps to get started with the .NET Word Library:
Step 1: Create a new C# .NET Console App in Visual Studio.
Step 2: Install the Syncfusion.DocIO.Net.Core NuGet package from the NuGet Gallery.
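Step 3: Include the following namespaces in the Program.cs file. The System namespaces cover the file streams, the List<string> collection, and the Regex class used in the examples in this blog.
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
using Syncfusion.DocIO;
using Syncfusion.DocIO.DLS;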
The project is set up! Now, let’s explore how to split Word documents using various methods.
Split a Word document by sections
If you have a comprehensive Word document with multiple sections, distributing and reviewing it as a single file can be cumbersome.
To simplify this process, you can split the Word document into individual files for each section. This approach allows team members to focus on specific parts, improving collaboration and streamlining the review process.
Refer to the following code example to split a Word document by sections using C#.
using (FileStream inputStream = new FileStream("Template.docx", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    // Open an existing Word document.
    using (WordDocument document = new WordDocument(inputStream, FormatType.Docx))
    {
        int sectionNumber = 1;
        // Iterate each section from the Word document.
        foreach (WSection section in document.Sections)
        {
            // Create a new Word document.
            using (WordDocument newDocument = new WordDocument())
            {
                // Clone and add a section from one Word document to another.
                newDocument.Sections.Add(section.Clone());
                // Save the Word document. FileMode.Create truncates any existing file,
                // so no stale bytes from a previous, larger output remain.
                using (FileStream outputStream = new FileStream("Section" + sectionNumber + ".docx", FileMode.Create, FileAccess.ReadWrite))
                {
                    newDocument.Save(outputStream, FormatType.Docx);
                }
            }
            sectionNumber++;
        }
    }
}
Running this code example saves each section as a separate file (Section1.docx, Section2.docx, and so on), as shown in the following image.
Split a Word document by headings
A lengthy user manual with multiple heading levels can be challenging to manage. By splitting the manual into separate documents based on major headings, reviewers can concentrate on specific topics, simplifying the review process.
Refer to the following code example to split a Word document by headings using C#.
using (FileStream inputStream = new FileStream("Template.docx", FileMode.Open, FileAccess.Read))
{
    // Open an existing Word document.
    using (WordDocument document = new WordDocument(inputStream, FormatType.Docx))
    {
        WordDocument newDocument = null;
        WSection newSection = null;
        int headingIndex = 0;
        // Iterate each section in the Word document.
        foreach (WSection section in document.Sections)
        {
            // Clone the section without items and add it to a new document.
            if (newDocument != null)
                newSection = AddSection(newDocument, section);
            // Iterate each child entity in the section body.
            foreach (TextBodyItem item in section.Body.ChildEntities)
            {
                // If the item is a paragraph, check for the heading style and split;
                // otherwise, add the item to the current document.
                if (item is WParagraph)
                {
                    WParagraph paragraph = item as WParagraph;
                    // If a paragraph has Heading 1 style, then save the traversed content as a separate document.
                    // And create a new document for new heading content.
                    if (paragraph.StyleName == "Heading 1")
                    {
                        if (newDocument != null)
                        {
                            // Save the Word document.
                            string fileName = "Document" + (headingIndex + 1) + ".docx";
                            SaveWordDocument(newDocument, fileName);
                            headingIndex++;
                        }
                        // Create a new document for new heading content.
                        newDocument = new WordDocument();
                        newSection = AddSection(newDocument, section);
                        AddEntity(newSection, paragraph);
                    }
                    else if (newDocument != null)
                        AddEntity(newSection, paragraph);
                }
                // Skip non-paragraph items (such as tables) that appear before the first Heading 1,
                // because no target section exists yet.
                else if (newDocument != null)
                    AddEntity(newSection, item);
            }
        }
        // Save the remaining content as a separate document.
        if (newDocument != null)
        {
            // Save the Word document.
            string fileName = "Document" + (headingIndex + 1) + ".docx";
            SaveWordDocument(newDocument, fileName);
        }
    }
}
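The heading example calls three helper methods, AddSection, AddEntity, and SaveWordDocument, whose bodies are not shown here. The following is a minimal sketch of what they might look like, inferred from how they are called; the class name SplitHelpers is hypothetical, and the choice to keep the cloned section's headers and footers is an assumption rather than the library's prescribed approach.

```csharp
using System.IO;
using Syncfusion.DocIO;
using Syncfusion.DocIO.DLS;

// Hypothetical container class; the helpers could equally live in Program.
static class SplitHelpers
{
    // Clones the source section (page setup, headers, footers), removes its
    // body content, and adds the empty clone to the new document.
    public static WSection AddSection(WordDocument newDocument, WSection section)
    {
        WSection newSection = section.Clone();
        newSection.Body.ChildEntities.Clear();
        newDocument.Sections.Add(newSection);
        return newSection;
    }

    // Adds a copy of the paragraph, table, or other body item to the section.
    public static void AddEntity(WSection newSection, Entity entity)
    {
        newSection.Body.ChildEntities.Add(entity.Clone());
    }

    // Saves the document as DOCX and releases it once the file is written.
    public static void SaveWordDocument(WordDocument newDocument, string fileName)
    {
        using (FileStream outputStream = new FileStream(fileName, FileMode.Create, FileAccess.ReadWrite))
        {
            newDocument.Save(outputStream, FormatType.Docx);
        }
        newDocument.Close();
    }
}
```

To call the helpers without a class prefix, as the heading example does, either place them in the same class as the splitting code or add using static SplitHelpers; at the top of the file.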
Running this code example saves the content under each Heading 1 as a separate file (Document1.docx, Document2.docx, and so on), as shown in the following image.
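The heading example splits only at paragraphs styled Heading 1. For a manual with several heading levels, the split point can be generalized. The sketch below is one possible way to do that, assuming the document uses the built-in style names "Heading 1", "Heading 2", and so on; HeadingRules and IsSplitHeading are hypothetical names, not part of DocIO.

```csharp
using System;

// Hypothetical helper for deciding which heading styles start a new file.
static class HeadingRules
{
    // Returns true when styleName is a built-in "Heading N" style
    // with 1 <= N <= maxLevel. Custom style names are not recognized.
    public static bool IsSplitHeading(string styleName, int maxLevel)
    {
        const string prefix = "Heading ";
        if (styleName == null || !styleName.StartsWith(prefix, StringComparison.Ordinal))
            return false;

        return int.TryParse(styleName.Substring(prefix.Length), out int level)
            && level >= 1
            && level <= maxLevel;
    }
}
```

With this helper, the check paragraph.StyleName == "Heading 1" could be replaced by HeadingRules.IsSplitHeading(paragraph.StyleName, 2) to start a new file at every Heading 1 and Heading 2.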
Split a Word document by bookmarks
Managing and distributing the content can be challenging if your document is divided into segments marked by bookmarks. Splitting the document based on these bookmarks allows you to handle each segment separately, making it easier to work with distinct parts of the content.
Refer to the following code example to split a Word document by bookmarks using C#.
using (FileStream fileStreamPath = new FileStream("Template.docx", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    // Open an existing Word document.
    using (WordDocument document = new WordDocument(fileStreamPath, FormatType.Docx))
    {
        // Create the bookmark navigator instance to access the bookmark.
        BookmarksNavigator bookmarksNavigator = new BookmarksNavigator(document);
        // Get all bookmarks in the Word document.
        BookmarkCollection bookmarkCollection = document.Bookmarks;
        // Iterate each bookmark in the Word document.
        foreach (Bookmark bookmark in bookmarkCollection)
        {
            // Move the virtual cursor to the location before the end of the bookmark.
            bookmarksNavigator.MoveToBookmark(bookmark.Name);
            // Get the bookmark content as WordDocumentPart.
            WordDocumentPart documentPart = bookmarksNavigator.GetContent();
            // Save the WordDocumentPart as a separate Word document.
            using (WordDocument newDocument = documentPart.GetAsWordDocument())
            {
                // Save the Word document to a file stream.
                using (FileStream outputFileStream = new FileStream(bookmark.Name + ".docx", FileMode.Create, FileAccess.ReadWrite))
                {
                    newDocument.Save(outputFileStream, FormatType.Docx);
                }
            }
        }
    }
}
Running this code example saves the content of each bookmark as a separate file named after the bookmark, as shown in the following image.
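Because the bookmark example uses each bookmark name as a file name, two practical details may be worth handling. Word adds hidden bookmarks whose names start with an underscore (such as _GoBack or the _Toc entries behind a table of contents), and bookmarks created programmatically can contain characters that are not valid in file names. The sketch below shows one way to deal with both; BookmarkFileNames and its methods are hypothetical names, and skipping hidden bookmarks is an assumption about the desired behavior.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for turning bookmark names into output file names.
static class BookmarkFileNames
{
    // Hidden bookmarks generated by Word begin with an underscore.
    public static bool IsHiddenBookmark(string bookmarkName)
    {
        return bookmarkName.StartsWith("_", StringComparison.Ordinal);
    }

    // Replaces every character that the current platform rejects in a
    // file name with an underscore and appends the .docx extension.
    public static string ToSafeFileName(string bookmarkName)
    {
        char[] invalid = Path.GetInvalidFileNameChars();
        StringBuilder builder = new StringBuilder(bookmarkName.Length);
        foreach (char c in bookmarkName)
        {
            builder.Append(Array.IndexOf(invalid, c) >= 0 ? '_' : c);
        }
        return builder.ToString() + ".docx";
    }
}
```

Inside the bookmark loop, this could be used as if (BookmarkFileNames.IsHiddenBookmark(bookmark.Name)) continue; followed by passing BookmarkFileNames.ToSafeFileName(bookmark.Name) to the output FileStream.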
Split a Word document by placeholders
If your document contains placeholder text wrapped in double angle brackets (a start marker and an end marker around each part) to indicate split points, dividing it manually can be tedious and error-prone. Use these placeholders to automatically create a separate Word file for each part, so the division follows the predefined markers.
Refer to the following code example to split a Word document by placeholder text using C#.
using (FileStream fileStreamPath = new FileStream("Template.docx", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    // Open an existing Word document.
    using (WordDocument document = new WordDocument(fileStreamPath, FormatType.Docx))
    {
        // Find all the placeholder text in the Word document.
        // The lazy quantifier keeps two placeholders in the same paragraph from matching as one.
        TextSelection[] textSelections = document.FindAll(new Regex("<<(.*?)>>"));
        if (textSelections != null)
        {
            // Unique ID for each bookmark.
            int bkmkId = 1;
            // Collection to hold the newly inserted bookmarks.
            List<string> bookmarks = new List<string>();
            #region Add bookmark start and end in the place of placeholder text
            // Iterate the text selections in start/end pairs; an unpaired trailing placeholder is ignored.
            for (int i = 0; i + 1 < textSelections.Length; i += 2)
            {
                // Get the start placeholder as WTextRange.
                WTextRange textRange = textSelections[i].GetAsOneRange();
                // Get the index of the placeholder text.
                WParagraph startParagraph = textRange.OwnerParagraph;
                int index = startParagraph.ChildEntities.IndexOf(textRange);
                string bookmarkName = "Bookmark_" + bkmkId;
                // Add new bookmark to bookmarks collection.
                bookmarks.Add(bookmarkName);
                // Create bookmark start.
                BookmarkStart bkmkStart = new BookmarkStart(document, bookmarkName);
                // Insert the bookmark start before the start placeholder.
                startParagraph.ChildEntities.Insert(index, bkmkStart);
                // Remove the placeholder text.
                textRange.Text = string.Empty;
                // Get the end placeholder as WTextRange.
                textRange = textSelections[i + 1].GetAsOneRange();
                // Get the index of the placeholder text.
                WParagraph endParagraph = textRange.OwnerParagraph;
                index = endParagraph.ChildEntities.IndexOf(textRange);
                // Create bookmark end.
                BookmarkEnd bkmkEnd = new BookmarkEnd(document, bookmarkName);
                // Insert the bookmark end after the end placeholder.
                endParagraph.ChildEntities.Insert(index + 1, bkmkEnd);
                bkmkId++;
                // Remove the placeholder text.
                textRange.Text = string.Empty;
            }
            #endregion
            #region Split document based on newly inserted bookmarks
            BookmarksNavigator bookmarksNavigator = new BookmarksNavigator(document);
            int fileIndex = 1;
            foreach (string bookmark in bookmarks)
            {
                // Move the virtual cursor to the location before the end of the bookmark.
                bookmarksNavigator.MoveToBookmark(bookmark);
                // Get the bookmark content as WordDocumentPart.
                WordDocumentPart wordDocumentPart = bookmarksNavigator.GetContent();
                // Save the WordDocumentPart as a separate Word document.
                using (WordDocument newDocument = wordDocumentPart.GetAsWordDocument())
                {
                    // Save the Word document to a file stream.
                    using (FileStream outputFileStream = new FileStream("Placeholder_" + fileIndex + ".docx", FileMode.Create, FileAccess.ReadWrite))
                    {
                        newDocument.Save(outputFileStream, FormatType.Docx);
                    }
                }
                fileIndex++;
            }
            #endregion
        }
    }
}
Running this code example saves the content between each pair of placeholders as a separate file (Placeholder_1.docx, Placeholder_2.docx, and so on), as shown in the following image.
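The placeholder approach relies on every start marker being followed by a matching end marker. The sketch below works on a plain string rather than a Word document, purely to illustrate the pairing logic: it uses a lazy pattern so that two markers on the same line are matched separately, and it reports an unpaired marker instead of failing on an out-of-range index. PlaceholderPairs and Pair are hypothetical names, and the marker texts in the usage note are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper that pairs start and end markers found in plain text.
static class PlaceholderPairs
{
    // Lazy quantifier: stops at the first ">>" after each "<<".
    private static readonly Regex Marker = new Regex("<<(.*?)>>");

    // Returns the (start, end) marker names in the order they appear.
    public static List<(string Start, string End)> Pair(string text)
    {
        MatchCollection matches = Marker.Matches(text);
        if (matches.Count % 2 != 0)
        {
            throw new InvalidOperationException("Found an unpaired placeholder marker.");
        }

        var pairs = new List<(string Start, string End)>();
        for (int i = 0; i < matches.Count; i += 2)
        {
            pairs.Add((matches[i].Groups[1].Value, matches[i + 1].Groups[1].Value));
        }
        return pairs;
    }
}
```

For example, PlaceholderPairs.Pair("<<A>> body <<B>>") yields one pair with Start "A" and End "B". A greedy pattern such as <<(.*)>> would instead treat that whole string as a single marker.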
References
For more details, refer to the GitHub demo and the documentation on splitting Word documents using the .NET Word Library and C#.
Conclusion
Thanks for reading! In this blog, we’ve seen how to split Word documents using Syncfusion .NET Word Library (DocIO) and C#. Take a moment to peruse our documentation, where you will find other options and features, all with accompanying code examples.
Apart from this, the Syncfusion .NET Word Library offers the following key features:
- Create, read, and edit Word documents programmatically.
- Create complex reports by merging data into a Word template from various data sources through mail merge.
- Merge, compare, and organize Word documents.
- Convert Word documents into HTML, RTF, PDF, images, and other formats.
You can also find more examples of Word Library on this GitHub location.
Are you already a Syncfusion user? You can download the product setup here. If you’re not yet a Syncfusion user, you can download a 30-day free trial.
If you have questions, contact us through our support forum, support portal, or feedback portal. We are always happy to assist you!