Get the text out of a Microsoft Word file. Read from MS Word.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices.ComTypes;

namespace readDOC
{
    class Program
    {
        static void Main(string[] args)
        {
            // Start Word through COM interop (Word must be installed on this machine).
            Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();
            object miss = System.Reflection.Missing.Value;
            object path = @"C:\DOC\myDocument.docx";
            object readOnly = true;

            // Open the document read-only; every other optional argument is left as Missing.
            Microsoft.Office.Interop.Word.Document docs = word.Documents.Open(
                ref path, ref miss, ref readOnly, ref miss,
                ref miss, ref miss, ref miss, ref miss,
                ref miss, ref miss, ref miss, ref miss,
                ref miss, ref miss, ref miss, ref miss);

            // The Paragraphs collection is 1-based, hence i + 1.
            string totaltext = "";
            for (int i = 0; i < docs.Paragraphs.Count; i++)
            {
                totaltext += " \r\n " + docs.Paragraphs[i + 1].Range.Text.ToString();
            }

            Console.WriteLine(totaltext);

            // Close the document and shut Word down so no WINWORD.EXE process is left behind.
            docs.Close();
            word.Quit();
        }
    }
}
Thank you for your article. This is a good way to read a Word document in C#, but it may not be usable in ASP.NET. I used Spire.Doc to read documents in ASP.NET.
Yeah, but this way you have to have Word installed on this machine or the client machine :(
This is a nice free library: http://sourceforge.net/p/word-reader
NeoOne, the library in your post works for me because: 1. it works with docx files; 2. you do not need Word installed on the user’s machine, as you mentioned :P
It is a shame that there is no source code and the last update from the author is Sep 2012.
Anyway the dll does what it says on the tin. Thank you.
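For anyone who only needs plain text from .docx files and cannot install Word or a third-party DLL: a .docx is a ZIP archive whose body lives in word/document.xml, so it can be read with the framework's own classes. The following is a minimal sketch, not how the library above works. It assumes .NET 4.5 or later with a reference to System.IO.Compression; the class name DocxText and method name Extract are made up for illustration. It does not handle the old binary .doc format, headers, footers or footnotes, and text inside text boxes may show up twice.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

static class DocxText
{
    // Main WordprocessingML namespace used inside word/document.xml.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the body text of a .docx stream, one line per <w:p> paragraph.
    public static string Extract(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found");

            using (Stream xml = entry.Open())
            {
                XDocument doc = XDocument.Load(xml);

                // Join the <w:t> text runs of each paragraph into one string.
                var paragraphs = doc.Descendants(W + "p")
                    .Select(p => string.Concat(
                        p.Descendants(W + "t").Select(t => t.Value)));

                return string.Join(Environment.NewLine, paragraphs);
            }
        }
    }
}
```

Usage would look like: using (var fs = File.OpenRead(@"C:\DOC\myDocument.docx")) Console.WriteLine(DocxText.Extract(fs));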
How about
string totaltext = docs.Content.Text;
to get all the text from a Word document?
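A minimal sketch of that suggestion, assuming C# 4 or later (so the ref and Missing arguments can be omitted on COM calls), a reference to Microsoft.Office.Interop.Word, and Word installed on the machine. The class name ReadAllText is made up; the path is the one from the article. The try/finally is my addition so that Word is closed even if opening the file fails.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

class ReadAllText
{
    static void Main()
    {
        var word = new Word.Application();
        Word.Document doc = null;
        try
        {
            // Open read-only; all other optional arguments keep their defaults.
            doc = word.Documents.Open(@"C:\DOC\myDocument.docx", ReadOnly: true);

            // Content.Text returns the whole main body in one call,
            // with paragraphs separated by '\r'.
            Console.WriteLine(doc.Content.Text);
        }
        finally
        {
            // false = do not save changes.
            if (doc != null) doc.Close(false);
            word.Quit();
        }
    }
}
```

Compared with the loop in the article, this avoids one COM call per paragraph, which can matter on long documents.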
You are the king of the world. You know how many shitty articles are out there. You rule.
Can anyone tell me how to load a Word file on a website, extract some fields from the document, and display those fields in different text boxes, like the resume upload on job portal sites?
Hi Mirza,
I am also looking for the same code. If you got it, please send it to me at vivekj666@gmail.com
I have used the above code, but it gives these errors:
Error 1 The type or namespace name ‘Office’ does not exist in the namespace ‘Microsoft’ (are you missing an assembly reference?) c:\users\nagendra\documents\visual studio 2012\Projects\ConsoleApplication1\ConsoleApplication1\Program.cs 13 23 ConsoleApplication1
Error 2 The type or namespace name ‘Office’ does not exist in the namespace ‘Microsoft’ (are you missing an assembly reference?) c:\users\nagendra\documents\visual studio 2012\Projects\ConsoleApplication1\ConsoleApplication1\Program.cs 13 76 ConsoleApplication1
Error 3 The type or namespace name ‘Office’ does not exist in the namespace ‘Microsoft’ (are you missing an assembly reference?) c:\users\nagendra\documents\visual studio 2012\Projects\ConsoleApplication1\ConsoleApplication1\Program.cs 17 23 ConsoleApplication1
I want to read an Office file in a Windows Forms application.
Please reply fast.
Add a COM reference to your project. Follow these steps:
Right-click your project name in the Solution Explorer window.
Click Add Reference.
Then select COM.
Search for Microsoft Word 12.0 Object Library or 9.0, whichever is available.
Click OK.
Now your code should run…
^^
1. Add a reference to Microsoft.Office.Interop.Word to your project. You can find it in the .NET section.
2. In your code, add using Microsoft.Office.Interop.Word;
Regards,
George
If I want a particular paragraph from the Word document, how can I access it using the above code?
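One way to do it, as a rough sketch: the Paragraphs collection from the article's code can be indexed directly, and it is 1-based. This assumes C# 4 or later, a reference to Microsoft.Office.Interop.Word, and Word installed. The class name ReadOneParagraph and the paragraph number 3 are placeholders for illustration only.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

class ReadOneParagraph
{
    static void Main()
    {
        const int wanted = 3; // hypothetical: 1-based number of the paragraph you need

        var word = new Word.Application();
        Word.Document doc = null;
        try
        {
            doc = word.Documents.Open(@"C:\DOC\myDocument.docx", ReadOnly: true);

            int count = doc.Paragraphs.Count;
            if (wanted >= 1 && wanted <= count)
            {
                // Range.Text ends with the paragraph mark ('\r'), so trim it.
                string text = doc.Paragraphs[wanted].Range.Text.TrimEnd('\r');
                Console.WriteLine(text);
            }
            else
            {
                Console.WriteLine("The document has only {0} paragraphs.", count);
            }
        }
        finally
        {
            // false = do not save changes.
            if (doc != null) doc.Close(false);
            word.Quit();
        }
    }
}
```

Note that Word counts empty lines and table cells as paragraphs too, so the number you need may be higher than what you see on the page.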
Thank you very much for some great code. I have been struggling with this very activity, but you have made it clean and clear. Very nice! Again, Thanks.
You can get the text of your doc file by using this .NET API for Word. It is not a free API, but it offers a free trial, so you can try it. You can also get sample code from their documentation page, as I do, because I have subscribed to their website.
How do I read a Word (docx) document that has MathType equations (OLE objects) and convert them to MathML using C#?
Thanks
Hey, thanks a lot. God bless you.
I am not able to read the contents when I run this code on an IIS server.
I have a structured Word template. I want to retrieve its content in JSON format. Can anyone explain to me how this can be achieved?
Very helpful :)
Thanks..
{“Creating an instance of the COM component with CLSID {000209FF-0000-0000-C000-000000000046} using CoCreateInstanceFromApp failed due to the following error: 80040154 Class not registered (Exception from HRESULT: 0x80040154 (REGDB_E_CLASSNOTREG)). Please make sure your COM object is in the allowed list of CoCreateInstanceFromApp.”}
I got the error on this line: Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();
I have already included Microsoft.Office.Interop.Word in the project references,
and there is no COM option available in the solution window.
Please help me solve it. I have been stuck on this error for the last 2 months. I have tried every solution, but none worked for me.
I am working with VS2015 and have Word 2016 installed, and I want to read a Word file from a given path.
I am not able to read a table in its original form.
Normal text is read fine, but tables are not printed properly.
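A possible reason: when you read paragraph by paragraph, every table cell comes back as its own paragraph, so the row and column layout is lost. A rough sketch of reading tables cell by cell instead, assuming C# 4 or later, a reference to Microsoft.Office.Interop.Word, and Word installed. The class name ReadTables is made up, and printing cells separated by tabs is just one way to show the grid.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

class ReadTables
{
    static void Main()
    {
        var word = new Word.Application();
        Word.Document doc = null;
        try
        {
            doc = word.Documents.Open(@"C:\DOC\myDocument.docx", ReadOnly: true);

            foreach (Word.Table table in doc.Tables)
            {
                // Rows and columns are 1-based.
                for (int r = 1; r <= table.Rows.Count; r++)
                {
                    var cells = new List<string>();
                    for (int c = 1; c <= table.Columns.Count; c++)
                    {
                        try
                        {
                            // Cell text ends with "\r\a" (the end-of-cell marker).
                            cells.Add(table.Cell(r, c).Range.Text.TrimEnd('\r', '\a'));
                        }
                        catch (COMException)
                        {
                            // Merged cells have no entry at this position.
                            cells.Add("");
                        }
                    }
                    Console.WriteLine(string.Join("\t", cells));
                }
                Console.WriteLine(); // blank line between tables
            }
        }
        finally
        {
            // false = do not save changes.
            if (doc != null) doc.Close(false);
            word.Quit();
        }
    }
}
```

This prints only the tables; the text around them still has to be read separately, for example with the paragraph loop from the article.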