[Java] Converting PPT to PNG for Preview

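I. Overview
I recently worked on a requirement that involved previewing PPT files. There are plenty of write-ups on this online and I tried several of them, but most of what a Baidu search turns up starts strong and then trails off: not exactly made up, just not accurate or rigorous enough. With that complaint out of the way, here is the overall plan. There are two ways to implement the preview: (1) convert the PPT to Flash and play the Flash animation in the front end, or (2) convert the PPT to images and cycle through them in the front end. I went with the second option, using Apache POI to convert each slide to a PNG image.
II. Runtime Environment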

  • JDK 1.8
  • Linux/Windows
III. Maven Dependencies
The code uses Apache POI 3.16. The specific jars are:
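<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.16</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.16</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>3.16</version>
</dependency>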

Here poi is the core jar, poi-ooxml handles the .pptx format, and poi-scratchpad handles the .ppt format.
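Because the two formats are handled by different POI modules, the caller has to decide which converter an uploaded file needs. The following is a minimal sketch of that decision, assuming the file extension can be trusted. SlideFormat, Kind and detect are hypothetical names, not part of POI or of the code in this post.

```java
import java.util.Locale;

final class SlideFormat {

    /** Which POI module a presentation file needs. */
    enum Kind { HSLF_PPT, XSLF_PPTX, UNSUPPORTED }

    /** Classifies a file by its extension, ignoring case. */
    static Kind detect(String fileName) {
        if (fileName == null) {
            return Kind.UNSUPPORTED;
        }
        String lower = fileName.toLowerCase(Locale.ROOT);
        // Check .pptx first so the order of the tests never matters to a reader.
        if (lower.endsWith(".pptx")) {
            return Kind.XSLF_PPTX;   // handled by poi-ooxml
        }
        if (lower.endsWith(".ppt")) {
            return Kind.HSLF_PPT;    // handled by poi-scratchpad
        }
        return Kind.UNSUPPORTED;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"deck.pptx", "OLD.PPT", "notes.txt"}) {
            System.out.println(name + " -> " + detect(name));
        }
    }
}
```

A caller would route XSLF_PPTX files to the .pptx converter and HSLF_PPT files to the .ppt converter, and reject everything else before opening the stream.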
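IV. Core Code
English text and images in a PPT need no special handling; the thing to watch is Chinese text. If Chinese and other special characters are not taken care of, the generated images will very likely show garbled text. The core code for dealing with that follows.
1. Rendering a .pptx file containing Chinese text to images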
public void converPPTXtoImage(InputStream pptFileIn, String targetDir) {
    try (XMLSlideShow oneSlideShow = new XMLSlideShow(pptFileIn)) {
        // XML fragment parsed into a CTTextFont below (its markup is not reproduced in this listing)
        String xmlFontFormat = "" + " " + " " + "" + "";
        Dimension onePPTPageSize = oneSlideShow.getPageSize();
        List<XSLFSlide> pptPageXSLFSLiseList = oneSlideShow.getSlides();
        for (int i = 0; i < pptPageXSLFSLiseList.size(); i++) {
            // Set the font to fix garbled Chinese text
            CTGroupShape oneCTGroupShape = pptPageXSLFSLiseList.get(i).getXmlObject().getCSld().getSpTree();
            for (CTShape ctShape : oneCTGroupShape.getSpList()) {
                CTTextBody oneCTTextBody = ctShape.getTxBody();
                if (null == oneCTTextBody) {
                    continue;
                }
                CTTextParagraph[] oneCTTextParagraph = oneCTTextBody.getPArray();
                CTTextFont oneCTTextFont = null;
                try {
                    oneCTTextFont = CTTextFont.Factory.parse(xmlFontFormat);
                } catch (XmlException e) {
                    // No usable font fragment; the shape is skipped below
                }
                if (oneCTTextFont == null) {
                    continue;
                }
                for (CTTextParagraph ctTextParagraph : oneCTTextParagraph) {
                    CTRegularTextRun[] onrCTRegularTextRunArray = ctTextParagraph.getRArray();
                    for (CTRegularTextRun ctRegularTextRun : onrCTRegularTextRunArray) {
                        CTTextCharacterProperties oneCTTextCharacterProperties = ctRegularTextRun.getRPr();
                        // Runs without explicit character properties have nothing to override
                        if (oneCTTextCharacterProperties != null) {
                            oneCTTextCharacterProperties.setLatin(oneCTTextFont);
                        }
                    }
                }
            }
            for (XSLFShape shape : pptPageXSLFSLiseList.get(i).getShapes()) {
                if (shape instanceof XSLFTextShape) {
                    XSLFTextShape txtshape = (XSLFTextShape) shape;
                    for (XSLFTextParagraph textPara : txtshape.getTextParagraphs()) {
                        List<XSLFTextRun> textRunList = textPara.getTextRuns();
                        for (XSLFTextRun textRun : textRunList) {
                            textRun.setFontFamily("simsun");
                        }
                    }
                }
            }
            BufferedImage oneBufferedImage = new BufferedImage(onePPTPageSize.width, onePPTPageSize.height, BufferedImage.TYPE_INT_RGB);
            Graphics2D oneGraphics2D = oneBufferedImage.createGraphics();
            pptPageXSLFSLiseList.get(i).draw(oneGraphics2D);
            oneGraphics2D.dispose();
            String imgName = (i + 1) + ".png";
            // new File(dir, name) works whether or not targetDir ends with a separator
            try (OutputStream imageOut = new FileOutputStream(new File(targetDir, imgName))) {
                ImageIO.write(oneBufferedImage, "png", imageOut);
            }
        }
    } catch (Exception e) {
        logger.error("converPPTXtoImage error", e);
    }
}

The part that fixes the garbled text is:
// Set the font to fix garbled Chinese text
CTGroupShape oneCTGroupShape = pptPageXSLFSLiseList.get(i).getXmlObject().getCSld().getSpTree();
for (CTShape ctShape : oneCTGroupShape.getSpList()) {
    CTTextBody oneCTTextBody = ctShape.getTxBody();
    if (null == oneCTTextBody) {
        continue;
    }
    CTTextParagraph[] oneCTTextParagraph = oneCTTextBody.getPArray();
    CTTextFont oneCTTextFont = null;
    try {
        oneCTTextFont = CTTextFont.Factory.parse(xmlFontFormat);
    } catch (XmlException e) {
        // No usable font fragment; the shape is skipped below
    }
    if (oneCTTextFont == null) {
        continue;
    }
    for (CTTextParagraph ctTextParagraph : oneCTTextParagraph) {
        CTRegularTextRun[] onrCTRegularTextRunArray = ctTextParagraph.getRArray();
        for (CTRegularTextRun ctRegularTextRun : onrCTRegularTextRunArray) {
            CTTextCharacterProperties oneCTTextCharacterProperties = ctRegularTextRun.getRPr();
            // Runs without explicit character properties have nothing to override
            if (oneCTTextCharacterProperties != null) {
                oneCTTextCharacterProperties.setLatin(oneCTTextFont);
            }
        }
    }
}
for (XSLFShape shape : pptPageXSLFSLiseList.get(i).getShapes()) {
    if (shape instanceof XSLFTextShape) {
        XSLFTextShape txtshape = (XSLFTextShape) shape;
        for (XSLFTextParagraph textPara : txtshape.getTextParagraphs()) {
            List<XSLFTextRun> textRunList = textPara.getTextRuns();
            for (XSLFTextRun textRun : textRunList) {
                textRun.setFontFamily("simsun");
            }
        }
    }
}
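The drawing step at the end of converPPTXtoImage can be looked at separately from POI. The sketch below is one possible variation, not the code used in this post: it paints a white background first (a TYPE_INT_RGB image starts out black) and scales the canvas so the PNG has more pixels than the slide's nominal page size. SlideRenderSketch, render, toPng and the painter callback are hypothetical names; the painter stands in for the slide's draw call.

```java
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;
import javax.imageio.ImageIO;

final class SlideRenderSketch {

    /** Renders one page onto a white canvas enlarged by the given scale factor. */
    static BufferedImage render(Dimension pageSize, double scale, Consumer<Graphics2D> painter) {
        int width = (int) Math.ceil(pageSize.width * scale);
        int height = (int) Math.ceil(pageSize.height * scale);
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            // Fill the whole canvas before scaling so no black border remains.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
            g.setRenderingHint(RenderingHints.KEY_TEXT_ANTIALIASING, RenderingHints.VALUE_TEXT_ANTIALIAS_ON);
            // Everything the painter draws in page coordinates is enlarged by this transform.
            g.scale(scale, scale);
            painter.accept(g);
        } finally {
            g.dispose();
        }
        return image;
    }

    /** Encodes the image as PNG bytes. */
    static byte[] toPng(BufferedImage image) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(image, "png", out)) {
                throw new IOException("no PNG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With POI the call would look roughly like render(oneSlideShow.getPageSize(), 2.0, g -> pptPageXSLFSLiseList.get(i).draw(g)), with the bytes from toPng written to the target directory. A factor of 2.0 is only an example value.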

2. Rendering a .ppt file containing Chinese text to images
public void converPPTtoImage(InputStream pptStream, String targetImageFileDir) {
    try (HSLFSlideShow oneSlideShow = new HSLFSlideShow(pptStream)) {
        List<HSLFSlide> pptPageXSLFSLiseList = oneSlideShow.getSlides();
        for (int i = 0; i < pptPageXSLFSLiseList.size(); i++) {
            // Set the font to fix garbled Chinese text
            for (List<HSLFTextParagraph> list : pptPageXSLFSLiseList.get(i).getTextParagraphs()) {
                for (HSLFTextParagraph hslfTextParagraph : list) {
                    for (HSLFTextRun textRun : hslfTextParagraph.getTextRuns()) {
                        Double size = textRun.getFontSize();
                        // Fall back to 20pt when the size is missing or out of range
                        if (size == null || size <= 0 || size >= 26040) {
                            textRun.setFontSize(20.0);
                        }
                        textRun.setFontFamily("simsun");
                    }
                }
            }
            String imgName = (i + 1) + ".png";
            BufferedImage oneBufferedImage = new BufferedImage(oneSlideShow.getPageSize().width, oneSlideShow.getPageSize().height, BufferedImage.TYPE_INT_RGB);
            Graphics2D oneGraphics2D = oneBufferedImage.createGraphics();
            pptPageXSLFSLiseList.get(i).draw(oneGraphics2D);
            oneGraphics2D.dispose();
            // new File(dir, name) works whether or not the directory ends with a separator
            try (OutputStream imageOut = new FileOutputStream(new File(targetImageFileDir, imgName))) {
                ImageIO.write(oneBufferedImage, "png", imageOut);
            }
        }
    } catch (Exception e) {
        logger.error("converPPTtoImage error", e);
    }
}

The part that fixes the garbled text is:
for (List<HSLFTextParagraph> list : pptPageXSLFSLiseList.get(i).getTextParagraphs()) {
    for (HSLFTextParagraph hslfTextParagraph : list) {
        for (HSLFTextRun textRun : hslfTextParagraph.getTextRuns()) {
            Double size = textRun.getFontSize();
            // Fall back to 20pt when the size is missing or out of range
            if (size == null || size <= 0 || size >= 26040) {
                textRun.setFontSize(20.0);
            }
            textRun.setFontFamily("simsun");
        }
    }
}
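The font size check in the loop above can be pulled out into a small pure function, which makes the rule easier to test. This is only a sketch: FontSizeGuard and sanitize are hypothetical names, the bounds and the 20.0 fallback are taken from the code above, and the null case assumes that getFontSize can return null because it hands back a boxed Double.

```java
final class FontSizeGuard {

    static final double FALLBACK_SIZE = 20.0;
    static final double UPPER_BOUND = 26040;

    /** Returns the size unchanged when it is usable, otherwise the fallback. */
    static double sanitize(Double size) {
        if (size == null || size.isNaN() || size <= 0 || size >= UPPER_BOUND) {
            return FALLBACK_SIZE;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(sanitize(null));
        System.out.println(sanitize(18.0));
        System.out.println(sanitize(30000.0));
    }
}
```

Inside the loop this would be used as textRun.setFontSize(FontSizeGuard.sanitize(textRun.getFontSize())).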

Both converters contain this call:
textRun.setFontFamily("simsun");

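This sets the font to SimSun (宋体). If the environment your program runs in already has the SimSun font installed, the call should instead be: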
textRun.setFontFamily("宋体");
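Whether the runtime actually has such a font can be checked before anything is rendered. Below is a small diagnostic sketch that uses only java.awt. FontCheck and its methods are hypothetical names, and comparing family names without regard to case is an assumption of this sketch. Note that the family names Java reports may be localized, so the same font can show up as SimSun or 宋体 depending on the locale.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;

final class FontCheck {

    /** True when the wanted family appears in the given list, ignoring case. */
    static boolean hasFamily(String[] availableFamilies, String wanted) {
        for (String family : availableFamilies) {
            if (family.equalsIgnoreCase(wanted)) {
                return true;
            }
        }
        return false;
    }

    /** Looks the family up among the fonts the current JVM can see. */
    static boolean isInstalled(String wanted) {
        String[] families = GraphicsEnvironment.getLocalGraphicsEnvironment().getAvailableFontFamilyNames();
        return hasFamily(families, wanted);
    }

    /**
     * True when the font Java resolves for this family can display every character
     * of the sample. An unknown family resolves to a default font, so this reports
     * what would really be drawn, not whether the named family exists.
     */
    static boolean canRender(String family, String sample) {
        Font font = new Font(family, Font.PLAIN, 12);
        return font.canDisplayUpTo(sample) == -1;
    }

    public static void main(String[] args) {
        for (String family : new String[] {"宋体", "SimSun", "simsun"}) {
            System.out.println(family + ": installed=" + isInstalled(family)
                    + ", rendersChinese=" + canRender(family, "中文预览"));
        }
    }
}
```

Running main on both the development machine and the deployment target shows quickly whether the two environments see the same fonts.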

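It does not have to be SimSun; any font that can render Chinese characters will do. The fonts bundled with the JVM cannot render Chinese. My local machine ran Windows, which ships with SimSun, so I started with textRun.setFontFamily("宋体"), and in my own testing the images generated from an uploaded pptx file showed no garbled Chinese. After deploying to the test environment, however, pptx files containing Chinese came out garbled. The cause was that the test environment runs Linux, which has no SimSun or other Chinese-capable font installed. One fix is to add SimSun to the font library of the JVM in the test environment and restart the Java application, after which textRun.setFontFamily("宋体") takes effect. The other is to register the SimSun font file with the JVM when the application starts. In that case the font set on the textRun should be the name under which the font was registered, for example simsun. (Java looks fonts up by the family name stored inside the font file, which for simsun.ttf is typically SimSun, so here it happens to match the file name.) The registration code is: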
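try (InputStream fontFile = Application.class.getClassLoader().getResourceAsStream("static/fonts/simsun.ttf")) {
    if (fontFile == null) {
        throw new FileNotFoundException("static/fonts/simsun.ttf not found on the classpath");
    }
    GraphicsEnvironment ge = GraphicsEnvironment.getLocalGraphicsEnvironment();
    Font dynamicFont = Font.createFont(Font.TRUETYPE_FONT, fontFile);
    ge.registerFont(dynamicFont);
} catch (Exception e) {
    logger.error("failed to register simsun.ttf", e);
}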

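This code reads simsun.ttf, creates a Font from it, and registers it with the GraphicsEnvironment. It can run during application startup. I use Spring, so I implemented the InitializingBean interface and put the code in its afterPropertiesSet method. The complete helper class is: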
public class POIPowerPointHelper implements InitializingBean {
    // Assumes SLF4J; the original listing does not show which logging library is used
    private static final Logger logger = LoggerFactory.getLogger(POIPowerPointHelper.class);
    private static final GraphicsEnvironment ge = GraphicsEnvironment.getLocalGraphicsEnvironment();

    @Override
    public void afterPropertiesSet() throws Exception {
        try (InputStream fontFile = POIPowerPointHelper.class.getClassLoader().getResourceAsStream("fonts/simsun.ttf")) {
            if (fontFile == null) {
                throw new FileNotFoundException("fonts/simsun.ttf not found on the classpath");
            }
            Font dynamicFont = Font.createFont(Font.TRUETYPE_FONT, fontFile);
            ge.registerFont(dynamicFont);
        } catch (Exception e) {
            logger.error("failed to register simsun.ttf", e);
        }
    }

    public void converPPTXtoImage(InputStream pptFileIn, String targetDir) {
        try (XMLSlideShow oneSlideShow = new XMLSlideShow(pptFileIn)) {
            // XML fragment parsed into a CTTextFont below (its markup is not reproduced in this listing)
            String xmlFontFormat = "" + " " + " " + "" + "";
            Dimension onePPTPageSize = oneSlideShow.getPageSize();
            List<XSLFSlide> pptPageXSLFSLiseList = oneSlideShow.getSlides();
            for (int i = 0; i < pptPageXSLFSLiseList.size(); i++) {
                // Set the font to fix garbled Chinese text
                CTGroupShape oneCTGroupShape = pptPageXSLFSLiseList.get(i).getXmlObject().getCSld().getSpTree();
                for (CTShape ctShape : oneCTGroupShape.getSpList()) {
                    CTTextBody oneCTTextBody = ctShape.getTxBody();
                    if (null == oneCTTextBody) {
                        continue;
                    }
                    CTTextParagraph[] oneCTTextParagraph = oneCTTextBody.getPArray();
                    CTTextFont oneCTTextFont = null;
                    try {
                        oneCTTextFont = CTTextFont.Factory.parse(xmlFontFormat);
                    } catch (XmlException e) {
                        // No usable font fragment; the shape is skipped below
                    }
                    if (oneCTTextFont == null) {
                        continue;
                    }
                    for (CTTextParagraph ctTextParagraph : oneCTTextParagraph) {
                        CTRegularTextRun[] onrCTRegularTextRunArray = ctTextParagraph.getRArray();
                        for (CTRegularTextRun ctRegularTextRun : onrCTRegularTextRunArray) {
                            CTTextCharacterProperties oneCTTextCharacterProperties = ctRegularTextRun.getRPr();
                            if (oneCTTextCharacterProperties != null) {
                                oneCTTextCharacterProperties.setLatin(oneCTTextFont);
                            }
                        }
                    }
                }
                for (XSLFShape shape : pptPageXSLFSLiseList.get(i).getShapes()) {
                    if (shape instanceof XSLFTextShape) {
                        XSLFTextShape txtshape = (XSLFTextShape) shape;
                        for (XSLFTextParagraph textPara : txtshape.getTextParagraphs()) {
                            List<XSLFTextRun> textRunList = textPara.getTextRuns();
                            for (XSLFTextRun textRun : textRunList) {
                                textRun.setFontFamily("simsun");
                            }
                        }
                    }
                }
                BufferedImage oneBufferedImage = new BufferedImage(onePPTPageSize.width, onePPTPageSize.height, BufferedImage.TYPE_INT_RGB);
                Graphics2D oneGraphics2D = oneBufferedImage.createGraphics();
                pptPageXSLFSLiseList.get(i).draw(oneGraphics2D);
                oneGraphics2D.dispose();
                String imgName = (i + 1) + ".png";
                try (OutputStream imageOut = new FileOutputStream(new File(targetDir, imgName))) {
                    ImageIO.write(oneBufferedImage, "png", imageOut);
                }
            }
        } catch (Exception e) {
            logger.error("converPPTXtoImage error", e);
        }
    }

    public void converPPTtoImage(InputStream pptStream, String targetImageFileDir) {
        try (HSLFSlideShow oneSlideShow = new HSLFSlideShow(pptStream)) {
            List<HSLFSlide> pptPageXSLFSLiseList = oneSlideShow.getSlides();
            for (int i = 0; i < pptPageXSLFSLiseList.size(); i++) {
                // Set the font to fix garbled Chinese text
                for (List<HSLFTextParagraph> list : pptPageXSLFSLiseList.get(i).getTextParagraphs()) {
                    for (HSLFTextParagraph hslfTextParagraph : list) {
                        for (HSLFTextRun textRun : hslfTextParagraph.getTextRuns()) {
                            Double size = textRun.getFontSize();
                            if (size == null || size <= 0 || size >= 26040) {
                                textRun.setFontSize(20.0);
                            }
                            textRun.setFontFamily("simsun");
                        }
                    }
                }
                String imgName = (i + 1) + ".png";
                BufferedImage oneBufferedImage = new BufferedImage(oneSlideShow.getPageSize().width, oneSlideShow.getPageSize().height, BufferedImage.TYPE_INT_RGB);
                Graphics2D oneGraphics2D = oneBufferedImage.createGraphics();
                pptPageXSLFSLiseList.get(i).draw(oneGraphics2D);
                oneGraphics2D.dispose();
                try (OutputStream imageOut = new FileOutputStream(new File(targetImageFileDir, imgName))) {
                    ImageIO.write(oneBufferedImage, "png", imageOut);
                }
            }
        } catch (Exception e) {
            logger.error("converPPTtoImage error", e);
        }
    }
}
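The helper writes one file per slide, named 1.png, 2.png and so on. For the front end to cycle through them in page order, the names need a numeric sort, because a plain string sort puts 10.png before 2.png. Here is a minimal sketch of that step using only the JDK. SlideImageLister, sortPages, listPages and pageNumber are hypothetical names, and how the list reaches the front end is left open.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class SlideImageLister {

    // Matches names such as 1.png or 12.png; at most 9 digits so parseInt cannot overflow.
    private static final Pattern PAGE_FILE = Pattern.compile("\\d{1,9}\\.png");

    /** Keeps only page images and orders them by page number. */
    static List<String> sortPages(Collection<String> fileNames) {
        return fileNames.stream()
                .filter(name -> PAGE_FILE.matcher(name).matches())
                .sorted(Comparator.comparingInt(SlideImageLister::pageNumber))
                .collect(Collectors.toList());
    }

    /** Lists the page images found directly inside the output directory. */
    static List<String> listPages(Path dir) {
        try (Stream<Path> files = Files.list(dir)) {
            List<String> names = files
                    .map(p -> p.getFileName().toString())
                    .collect(Collectors.toList());
            return sortPages(names);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Extracts the page number from a name that already matched PAGE_FILE. */
    static int pageNumber(String fileName) {
        return Integer.parseInt(fileName.substring(0, fileName.length() - ".png".length()));
    }
}
```

Calling listPages on the directory passed as targetDir returns the image names in slide order, ready to be turned into URLs for the carousel.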

PS: The class above does not include its import statements; add the imports yourself before using it. The code is somewhat rough, so corrections are welcome. Reference: https://blog.csdn.net/yushuai_it/article/details/65445898
