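This article covers how to convert a PDF file to text in Android Studio.
I want to pick a PDF file from the file manager on Android and convert it to text so that text-to-speech can read it. I am following this documentation on the Android developers site; however, that example is for opening a text file. I am using the PdfReader class/library to open the file and convert it to text, but I don't know how to integrate it with a Uri. This is the code I need to use to convert a PDF to text with PdfReader: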
PdfReader pdfReader = new PdfReader(file.getPath());
stringParser = PdfTextExtractor.getTextFromPage(pdfReader, 1).trim();
pdfReader.close();
I use an intent to call the file manager so that the user can pick a PDF file:
fab.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View view) {
        intent = new Intent(Intent.ACTION_OPEN_DOCUMENT);
        intent.setType("*/*");
        startActivityForResult(intent, READ_REQUEST_CODE);
    }
});
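Because the picker above accepts any file type (`*/*`), it can help to confirm that the chosen content really is a PDF before handing it to a parser. The following is a minimal sketch, not part of the original question: the class name `PdfSignature` and the method `looksLikePdf` are hypothetical, and it assumes the `%PDF-` signature sits at the very start of the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: checks the leading bytes of a stream for the PDF signature.
final class PdfSignature {
    // PDF files start with the ASCII bytes "%PDF-".
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the stream starts with "%PDF-". Consumes up to 5 bytes. */
    static boolean looksLikePdf(InputStream in) throws IOException {
        byte[] head = new byte[MAGIC.length];
        int read = 0;
        // read() may return fewer bytes than requested, so loop until full or EOF.
        while (read < head.length) {
            int n = in.read(head, read, head.length - read);
            if (n < 0) {
                return false; // stream ended before the signature was complete
            }
            read += n;
        }
        return Arrays.equals(head, MAGIC);
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikePdf(new ByteArrayInputStream(sample)));
    }
}
```

The check consumes the first bytes of the stream, so a fresh stream would need to be opened for the actual parsing.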
Then I want to get the Uri and open the file:
@Override
protected void onActivityResult(int requestCode, int resultCode, Intent resultData) {
    if (requestCode == READ_REQUEST_CODE && resultCode == Activity.RESULT_OK) {
        if (resultData != null) {
            Uri uri = resultData.getData();
            Toast.makeText(MainActivity.this, filePath, Toast.LENGTH_LONG).show();
            readPdfFile(uri);
        }
    }
}

private String readTextFromUri(Uri uri) throws IOException {
    StringBuilder stringBuilder = new StringBuilder();
    try (InputStream inputStream = getContentResolver().openInputStream(uri);
         BufferedReader reader = new BufferedReader(
                 new InputStreamReader(Objects.requireNonNull(inputStream)))) {
        String line;
        while ((line = reader.readLine()) != null) {
            stringBuilder.append(line);
        }
    }
    return stringBuilder.toString();
}
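One possible way to bridge the gap between the Uri and the path-based `new PdfReader(file.getPath())` call from the question is to copy the picked content into a temporary file first. The sketch below shows only that copy step in plain Java; the class name `PdfTempFiles` and the method `copyToTempFile` are hypothetical. On Android, the stream would presumably come from `getContentResolver().openInputStream(uri)` and the directory from `getCacheDir()`. iText 5's `PdfReader` also has a constructor that takes an `InputStream`, which may make the copy unnecessary.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: materializes a stream as a temporary .pdf file.
final class PdfTempFiles {
    /**
     * Copies all bytes from the stream into a new temporary file.
     * dir may be null, in which case the default temp directory is used.
     * The caller is responsible for closing the input stream.
     */
    static File copyToTempFile(InputStream in, File dir) throws IOException {
        File out = File.createTempFile("picked", ".pdf", dir);
        try (OutputStream os = new FileOutputStream(out)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                os.write(buffer, 0, n);
            }
        }
        return out;
    }
}
```

The returned file's `getPath()` can then be passed to the existing PdfReader code, and the file can be deleted once the text has been extracted.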
Answer
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class SyncPdfTextExtractor {
    // TODO: When you have your own Premium account credentials, put them down here:
    private static final String CLIENT_ID = "FREE_TRIAL_ACCOUNT";
    private static final String CLIENT_SECRET = "PUBLIC_SECRET";
    private static final String ENDPOINT = "https://api.whatsmate.net/v1/pdf/extract?url=";

    /**
     * Entry Point
     */
    public static void main(String[] args) throws Exception {
        // TODO: Specify the URL of your small PDF document (less than 1MB and 10 pages)
        // To extract text from a bigger PDF document, you need to use the async method.
        String url = "https://www.harvesthousepublishers.com/data/files/excerpts/9780736948487_exc.pdf";
        SyncPdfTextExtractor.extractText(url);
    }

    /**
     * Extracts the text from an online PDF document.
     */
    public static void extractText(String pdfUrl) throws Exception {
        URL url = new URL(ENDPOINT + pdfUrl);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // No request body is sent, so setDoOutput(true) is not needed for a GET.
        conn.setRequestMethod("GET");
        conn.setRequestProperty("X-WM-CLIENT-ID", CLIENT_ID);
        conn.setRequestProperty("X-WM-CLIENT-SECRET", CLIENT_SECRET);

        int statusCode = conn.getResponseCode();
        System.out.println("Status Code: " + statusCode);

        InputStream is;
        if (statusCode == 200) {
            is = conn.getInputStream();
            System.out.println("PDF text is shown below");
            System.out.println("=======================");
        } else {
            is = conn.getErrorStream();
            System.err.println("Something is wrong:");
        }

        // getErrorStream() can return null when the server sent no error body.
        if (is != null) {
            try (BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
                String output;
                while ((output = br.readLine()) != null) {
                    System.out.println(output);
                }
            }
        }
        conn.disconnect();
    }
}
After copying the code above, follow these steps:
- Specify the URL of your online PDF document in the url variable in main.
- Replace CLIENT_ID and CLIENT_SECRET if you have your own credentials.
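The code above appends the PDF URL to the endpoint as a raw query value. If the document URL itself contains characters such as `?`, `&`, or spaces, the request URL can become ambiguous. A minimal sketch of percent-encoding the value first is shown below; the class name `ExtractUrlBuilder` and the method `buildRequestUrl` are hypothetical, and it is an assumption that the service decodes the `url` parameter in the usual way.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the request URL with an encoded query value.
final class ExtractUrlBuilder {
    /** Appends pdfUrl to endpoint after form-encoding it as UTF-8. */
    static String buildRequestUrl(String endpoint, String pdfUrl)
            throws UnsupportedEncodingException {
        return endpoint + URLEncoder.encode(pdfUrl, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String endpoint = "https://api.whatsmate.net/v1/pdf/extract?url=";
        System.out.println(buildRequestUrl(endpoint, "https://example.com/my file.pdf?v=1"));
    }
}
```

Note that `URLEncoder` applies form encoding, so a space becomes `+` rather than `%20`.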
Another answer
Add this Gradle dependency:
implementation 'com.itextpdf:itextg:5.5.10'
try {
    String parsedText = "";
    PdfReader reader = new PdfReader(yourPdfPath);
    int n = reader.getNumberOfPages();
    for (int i = 0; i < n; i++) {
        // Extracting the content from the different pages
        parsedText = parsedText + PdfTextExtractor.getTextFromPage(reader, i + 1).trim() + "\n";
    }
    System.out.println(parsedText);
    reader.close();
} catch (Exception e) {
    System.out.println(e);
}
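The page loop above builds the result by repeated string concatenation. For longer documents, a `StringBuilder` is a common alternative. The sketch below is a variation rather than the answer's own method: the interface `PageTextSource` and the class `PageJoiner` are hypothetical, pages are numbered from 1 as in the loop above, and unlike the original it skips empty pages and adds no trailing newline.

```java
import java.io.IOException;

// Hypothetical abstraction over "give me the text of page N" (1-based).
interface PageTextSource {
    String textOf(int page) throws IOException;
}

final class PageJoiner {
    /** Joins the trimmed text of pages 1..pageCount, separated by newlines. */
    static String joinPages(int pageCount, PageTextSource source) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int page = 1; page <= pageCount; page++) {
            String text = source.textOf(page);
            if (text == null) {
                continue;
            }
            text = text.trim();
            if (text.isEmpty()) {
                continue; // skip blank pages so the reader does not pause on them
            }
            if (sb.length() > 0) {
                sb.append('\n');
            }
            sb.append(text);
        }
        return sb.toString();
    }
}
```

With the iText classes from the answer, a call might look like `PageJoiner.joinPages(reader.getNumberOfPages(), page -> PdfTextExtractor.getTextFromPage(reader, page))`, and the result could then be passed to text-to-speech.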