Simple method for indexing MS Word documents
Hits: 1574
Description: Building indexers/spiders that can read binary MS Word (.doc) documents can be difficult, especially on *nix servers, where PHP's COM support is not available.
Solutions usually involve installing binaries on the server, which is often impossible or disallowed.
This simple PHP snippet does a pretty good job of extracting text from an MS Word document for use in a search index. It does not pretend to be perfect, but it has proved useful on thousands of test documents.
Resource Specification
Platform(s): linux, windows, freebsd, sun
Date Added: Apr 12, 2006
Last Updated: Apr 30, 2006
Author: The Mouse Whisperer
License Information
License # 1:
License Type: Freeware
Price: Free
Average Visitor Rating: 4.22 (out of 5)
Number of Ratings: 9 Votes
Visitor ratings breakdown by period
| Period | 1 | 2 | 3 | 4 | 5 | AVG |
| Last 7 Days | 0 | 0 | 0 | 0 | 0 | 0.00 |
| Last Month | 0 | 0 | 0 | 0 | 0 | 0.00 |
| Last Year | 0 | 0 | 0 | 0 | 0 | 0.00 |
| Overall | 0 | 0 | 2 | 3 | 4 | 4.22 |