HTMLEditor for Linux: a tool for editing, previewing, and converting WordPress content to DOCX and PDF

HTML Editor: an advanced tool for editing, converting, and displaying HTML documents

Author: Aleksandar Maričić
Version: 1.0
Platform: Python + PyQt5
License: Free for personal and academic use
Fonts that must be installed: cambria-math.ttf, courbd.ttf, courbi.ttf, couri.ttf, and cour.ttf


HTML Editor (htmleditor.py) is a desktop program for editing, displaying, and converting HTML documents with rich content, including mathematical formulas (MathJax / LaTeX), tables, images, and program code.

The program is written in Python with the PyQt5 library. Its graphical interface pairs a simple workflow with features aimed at authors of scientific and technical texts.

Key features

  • Editing HTML files: a text editor with HTML syntax highlighting (tags, attributes, values, comments, CSS, and JavaScript).
  • Real-time preview: the result is displayed automatically in the right-hand pane, with full support for MathJax formulas, images, and styles.
  • Importing articles from WordPress sites: enter a URL and the article is downloaded, stripped of unneeded parts, and saved locally with all its images and formatting.
  • Automatic image download: all images in the article are saved to a separate local folder and relinked in the HTML file.
  • MathJax / LaTeX support: inline and display formulas are rendered and exported to every format (PDF, ODT, DOCX).
  • Advanced search: a tool for searching text in the HTML code, with Next / Prev navigation and color highlighting of results.
  • Export to multiple formats: documents can be saved as:
    • PDF: visually identical output for printing
    • DOCX: compatible with Microsoft Word
    • ODT: compatible with LibreOffice / OpenOffice Writer
  • EnlighterJS and program code: the syntax formatting of code examples in the HTML is preserved (useful for programming and technical articles).

How it works

The program combines two synchronized areas:

  • Left pane: a text editor with syntax coloring and line numbering, suited to editing HTML code directly.
  • Right pane: a dynamic view of the content (rendered with QWebEngineView) with full support for CSS, images, and MathJax formulas.

Every change in the code is reflected in the preview automatically after a short delay (the code restarts a 400 ms single-shot timer on each edit), so the preview stays current without manual refreshing.
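The delayed refresh is a debounce: each edit restarts a timer, and the preview is rebuilt only once typing pauses. The sketch below shows the idea with the standard library only, so it can run without Qt. The `Debouncer` class and the `refresh_count` list are hypothetical names used for illustration; the program itself uses a single-shot QTimer.

```python
import threading
import time


class Debouncer:
    """Run `callback` once, `delay` seconds after the most recent trigger."""

    def __init__(self, delay, callback):
        self.delay = delay
        self.callback = callback
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self):
        # Cancel the pending timer (if any) and start a fresh one,
        # much like restarting a single-shot timer on every keystroke.
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.delay, self.callback)
            self._timer.daemon = True
            self._timer.start()


refresh_count = []  # stands in for "the preview was refreshed"
debouncer = Debouncer(0.05, lambda: refresh_count.append(1))

for _ in range(5):  # five rapid "keystrokes"
    debouncer.trigger()

time.sleep(0.5)  # let the quiet period elapse
print(len(refresh_count))
```

Five triggers in quick succession lead to a single refresh, which is why fast typing does not reload the preview on every character.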

Importing articles from WordPress sites

One of the program's most useful features is the automatic import and cleanup of articles from WordPress sites. Enter the article's URL and the program will:

  • download the HTML content from the internet,
  • extract the main part of the text (title, images, tables, formulas),
  • delete unneeded navigation and advertising parts,
  • download all images and save them locally,
  • add CSS styles and MathJax scripts,
  • format the article for further editing and conversion.

The result is a clean, compact HTML file with its images stored locally, ready for publishing, archiving, or conversion to other formats.
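The image step in the list above comes down to resolving each `src` against the article URL and choosing a local file name for it. The following is a minimal sketch of that mapping using only the standard library; the program itself parses pages with BeautifulSoup and downloads with requests. `ImageCollector`, `plan_image_downloads`, and the example URLs are hypothetical and serve only as illustration.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def plan_image_downloads(html, page_url, image_folder):
    """Map each image src to (absolute URL, local file path)."""
    collector = ImageCollector()
    collector.feed(html)
    plan = {}
    for src in collector.sources:
        absolute = urljoin(page_url, src)  # resolves relative links
        filename = os.path.basename(urlparse(absolute).path)
        plan[src] = (absolute, os.path.join(image_folder, filename))
    return plan


sample = (
    '<article><img src="/wp-content/uploads/plot.png">'
    '<img src="https://cdn.example.org/a/b.jpg"></article>'
)
plan = plan_image_downloads(sample, "https://example.com/blog/post/", "post_images")
for src, (absolute, local_path) in plan.items():
    print(src, "->", absolute, "->", local_path)
```

No network access happens here; the sketch only computes where each image would come from and where it would be stored.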

Conversion and export

HTML Editor supports direct export to PDF, ODT, and DOCX. For ODT and DOCX, the MathJax formulas are first rendered in a headless browser (Playwright), and the rendered HTML is then passed to Pandoc, which generates the output document. PDF export prints the page from a QWebEngineView preview dialog.

This two-stage processing preserves mathematical expressions and formatting even in Word and LibreOffice files.
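The second stage of that pipeline is a plain Pandoc call on the rendered HTML file. Below is a small sketch of how such a command could be assembled; the arguments mirror the ones used in htmleditor.py, while `build_pandoc_command`, `convert`, and `SUPPORTED_FORMATS` are hypothetical helper names introduced here.

```python
import shutil
import subprocess
from pathlib import Path

SUPPORTED_FORMATS = {"odt", "docx"}  # PDF is produced by the preview dialog instead


def build_pandoc_command(rendered_html, output_format):
    """Return the pandoc argument list for converting rendered HTML."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {output_format}")
    source = Path(rendered_html)
    stem = source.stem
    # The rendered page is stored as "<name>_mathml.html"; drop that suffix
    # so the output lands next to the original as "<name>.<format>".
    if stem.endswith("_mathml"):
        stem = stem[: -len("_mathml")]
    target = source.with_name(f"{stem}.{output_format}")
    return ["pandoc", str(source), "-s", "-o", str(target), "--mathml"]


def convert(rendered_html, output_format):
    """Run pandoc if it is on PATH and return the executed command."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    command = build_pandoc_command(rendered_html, output_format)
    subprocess.run(command, check=True)
    return command


cmd = build_pandoc_command("article_mathml.html", "docx")
print(" ".join(cmd))
```

Building the argument list separately from running it makes the command easy to inspect or log before Pandoc is actually invoked.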

Search and navigation

The Search function finds any text in the HTML code, highlights the results in yellow, and lets you move through the occurrences with the Next and Prev buttons. The current result is additionally marked in orange, and the preview scrolls automatically to the corresponding part of the text.
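Next / Prev navigation can be modeled as a list of match positions plus an index that wraps around at both ends. The sketch below illustrates this in plain Python; `find_all` and `step` are hypothetical helpers, and the matching here is case-sensitive for simplicity, which may differ from the editor widget's behavior.

```python
def find_all(haystack, needle):
    """Return the start offset of every non-overlapping occurrence."""
    if not needle:
        return []
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + len(needle))
    return positions


def step(current_index, delta, total):
    """Move through the results, wrapping around at both ends."""
    return (current_index + delta) % total


html = "<p>alpha</p><p>beta</p><p>alpha</p>"
matches = find_all(html, "<p>")
index = 0
index = step(index, -1, len(matches))  # Prev from the first match
print(matches, index)
```

The modulo step means that Prev on the first match jumps to the last one, and Next on the last match returns to the first.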

Use and applications

HTML Editor is well suited to:

  • teachers, researchers, and authors of scientific articles with formulas,
  • programmers and technical writers who document code,
  • bloggers who want to download, edit, and print WordPress texts,
  • students preparing papers in Word, LibreOffice, or PDF format.

The program combines the functions of a text editor, viewer, converter, and content downloader in a single application.

Technical specifications

  • Language: Python 3
  • GUI: PyQt5
  • Rendering: QWebEngineView (Chromium)
  • Conversion: Pandoc + Playwright
  • Math: MathJax 3
  • License: Free for personal and academic use

Conclusion

htmleditor.py is a complete tool for processing and converting HTML documents, with full support for images, code, formulas, and tables. It lets authors and researchers import content from WordPress, edit it locally, and export it to professional formats without loss of quality.

A simple interface, stable operation, and support for mathematical expressions make it a good fit for anyone who wants to bring web content, scientific formulas, and text processing together in one working environment.


HTML Editor: science, technology, and content in harmony.

INSTALLATION:

Pandoc releases: https://github.com/jgm/pandoc/releases

1. Installation script for Linux: install_htmleditor.sh


#!/bin/bash
# install_htmleditor.sh
# Installs all system-wide dependencies for htmleditor.py (without a venv)

set -e  # stop at the first error

LOGFILE="install_global_log.txt"

echo "=========================================="
echo "   Global installation of htmleditor.py dependencies"
echo "=========================================="

# --- Check Python ---
if ! command -v python3 &> /dev/null; then
    echo "❌ Python3 is not installed. Install it first."
    exit 1
fi

# --- Check pip ---
if ! command -v pip3 &> /dev/null; then
    echo "📦 Installing pip..."
    if [ -f /etc/debian_version ]; then
        sudo apt update -y && sudo apt install -y python3-pip
    elif [ -f /etc/fedora-release ]; then
        sudo dnf install -y python3-pip
    elif [ -f /etc/arch-release ]; then
        sudo pacman -Sy --noconfirm python-pip
    else
        echo "⚠️ Unknown distribution. Install pip manually."
        exit 1
    fi
else
    echo "✅ pip3 is already installed."
fi

# --- Upgrade pip ---
echo "🔄 Upgrading pip..."
sudo pip3 install --upgrade pip setuptools wheel >> "$LOGFILE" 2>&1

# --- Install the Python packages globally ---
echo "📦 Installing Python packages..."
sudo pip3 install --upgrade PyQt5 PyQtWebEngine requests beautifulsoup4 lxml reportlab playwright python-docx >> "$LOGFILE" 2>&1

# --- Install the Playwright Chromium browser ---
echo "🌐 Installing the Playwright browser..."
python3 -m playwright install chromium >> "$LOGFILE" 2>&1

# --- Check pandoc ---
if ! command -v pandoc &> /dev/null; then
    echo "📄 Pandoc was not found. Installing pandoc..."
    if [ "$(uname)" == "Darwin" ]; then
        if ! command -v brew &> /dev/null; then
            echo "⚠️ Homebrew is not installed. Install it like this:"
            echo '/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"'
            exit 1
        fi
        brew install pandoc
    elif [ -f /etc/debian_version ]; then
        sudo apt update -y && sudo apt install -y pandoc
    elif [ -f /etc/fedora-release ]; then
        sudo dnf install -y pandoc
    elif [ -f /etc/arch-release ]; then
        sudo pacman -Sy --noconfirm pandoc
    else
        echo "⚠️ Unknown distribution. Please install pandoc manually:"
        echo "👉 https://pandoc.org/installing.html"
    fi
else
    echo "✅ Pandoc is already installed."
fi

# --- Check htmleditor.py ---
if [ ! -f "htmleditor.py" ]; then
    echo "❌ File 'htmleditor.py' was not found in the current directory!"
    exit 1
fi

echo "------------------------------------------"
echo "🚀 Starting htmleditor.py (global installation)..."
python3 htmleditor.py

echo "✅ Done! (detailed log: $LOGFILE)"

Save the script above as install_htmleditor.sh and run it with:

chmod +x install_htmleditor.sh
./install_htmleditor.sh

2. Installation script for Windows (virtual environment)

Below is a complete Windows script that creates a virtual environment, installs the required Python packages and Playwright with Chromium, checks for pandoc, and finally starts htmleditor.py.
Start the script by double-clicking run_htmleditor_in_venv.bat.

@echo off
REM ===============================
REM run_htmleditor_in_venv.bat
REM Sets up and starts htmleditor.py in a virtual environment on Windows 10
REM ===============================

REM --- Variables ---
set VENV_DIR=venv
set LOGFILE=install_log.txt

echo ==========================================
echo    Setting up the virtual environment for htmleditor.py
echo ==========================================

REM --- Check Python ---
where python >nul 2>&1
if %ERRORLEVEL% neq 0 (
    echo Python was not found. Install it first.
    exit /b 1
)

REM --- Check htmleditor.py ---
if not exist htmleditor.py (
    echo File 'htmleditor.py' was not found in the current directory!
    exit /b 1
)

REM --- Create the virtual environment ---
if not exist "%VENV_DIR%" (
    echo Creating virtual environment: %VENV_DIR%
    python -m venv "%VENV_DIR%"
) else (
    echo Virtual environment already exists: %VENV_DIR%
)

REM --- Activate the venv ---
echo  Activating the virtual environment...
call "%VENV_DIR%\Scripts\activate.bat"

REM --- Upgrade pip ---
echo  Upgrading pip...
pip install --upgrade pip setuptools wheel >> "%LOGFILE%" 2>&1

REM --- Install the Python packages ---
echo  Checking and installing Python packages...
pip install --upgrade PyQt5 PyQtWebEngine requests beautifulsoup4 lxml reportlab playwright python-docx >> "%LOGFILE%" 2>&1

REM --- Install the Playwright Chromium browser ---
echo  Checking and installing the Playwright browser...
python -m playwright install chromium >> "%LOGFILE%" 2>&1

REM --- Check pandoc ---
where pandoc >nul 2>&1
if %ERRORLEVEL% neq 0 (
    echo  Pandoc was not found. Please install it manually:
    echo  https://pandoc.org/installing.html
) else (
    echo  Pandoc is already installed.
)

echo ------------------------------------------
echo  Starting htmleditor.py...
python htmleditor.py

REM --- Deactivate the venv ---
REM "call" is required: running another batch file without it never returns here
call deactivate

echo  Done! (detailed log: %LOGFILE%)
pause

Below is a complete Linux/macOS script that creates a virtual environment, installs the required Python packages and Playwright with Chromium, checks for pandoc, and finally starts htmleditor.py.


📝 run_htmleditor_in_venv.sh

#!/bin/bash
# run_htmleditor_in_venv.sh
# Sets up and starts htmleditor.py in a virtual environment

set -e  # stop at the first error

SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
cd "$SCRIPT_DIR"

VENV_DIR="venv"
LOGFILE="install_log.txt"

echo "=========================================="
echo "   Setting up the virtual environment for htmleditor.py"
echo "=========================================="

# --- Check Python ---
if ! command -v python3 &> /dev/null; then
    echo "❌ Python3 is not installed. Install it first."
    exit 1
fi

# --- Check htmleditor.py ---
if [ ! -f "htmleditor.py" ]; then
    echo "❌ File 'htmleditor.py' was not found in the current directory!"
    exit 1
fi

# --- Create the virtual environment ---
if [ ! -d "$VENV_DIR" ]; then
    echo "🔧 Creating virtual environment: $VENV_DIR"
    python3 -m venv "$VENV_DIR"
else
    echo "✅ Virtual environment already exists: $VENV_DIR"
fi

# --- Activate the venv ---
if [ -z "$VIRTUAL_ENV" ]; then
    echo "⚡ Activating the virtual environment..."
    source "$VENV_DIR/bin/activate"
else
    echo "✅ The virtual environment is already active."
fi

# --- Upgrade pip ---
# ">> file 2>&1" is used instead of "&>>" so the script also works with bash 3.2 on macOS
echo "🔄 Upgrading pip..."
pip install --upgrade pip setuptools wheel >> "$LOGFILE" 2>&1

# --- Install the Python packages ---
echo "📦 Installing Python packages..."
pip install --upgrade PyQt5 PyQtWebEngine requests beautifulsoup4 lxml reportlab playwright python-docx >> "$LOGFILE" 2>&1

# --- Install the Playwright Chromium browser ---
echo "🌐 Installing the Playwright browser..."
python3 -m playwright install chromium >> "$LOGFILE" 2>&1

# --- Check pandoc ---
if ! command -v pandoc &> /dev/null; then
    echo "📄 Pandoc was not found. Installing pandoc..."
    if [ "$(uname)" == "Darwin" ]; then
        if ! command -v brew &> /dev/null; then
            echo "⚠️ Homebrew is not installed. Install it like this:"
            echo '/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"'
            exit 1
        fi
        brew install pandoc
    elif [ -f /etc/debian_version ]; then
        sudo apt update -y && sudo apt install -y pandoc
    elif [ -f /etc/fedora-release ]; then
        sudo dnf install -y pandoc
    elif [ -f /etc/arch-release ]; then
        sudo pacman -Sy --noconfirm pandoc
    else
        echo "⚠️ Unknown distribution. Please install pandoc manually:"
        echo "👉 https://pandoc.org/installing.html"
    fi
else
    echo "✅ Pandoc is already installed."
fi

echo "------------------------------------------"
echo "🚀 Starting htmleditor.py..."
python3 htmleditor.py

# --- Deactivate the venv at the end ---
deactivate

echo "✅ Done! (detailed log: $LOGFILE)"

Source code for htmleditor.py


#!/usr/bin/env python3
# htmleditor.py
# Install the required dependencies: pip install --upgrade pip setuptools wheel PyQt5 PyQtWebEngine requests beautifulsoup4 lxml reportlab playwright python-docx && python3 -m playwright install chromium
# Download pandoc for your OS and install it from: https://github.com/jgm/pandoc/releases/

import sys
import os
import requests
import subprocess
import asyncio
from PyQt5.QtWidgets import (
    QApplication, QWidget, QSplitter, QVBoxLayout, QHBoxLayout,
    QTextEdit, QPushButton, QFileDialog, QLineEdit, QMessageBox, QDialog
)
from PyQt5.QtCore import Qt, QTimer, QUrl, QRegExp
from PyQt5.QtGui import QFont, QSyntaxHighlighter, QTextCharFormat, QColor, QTextCursor
from PyQt5.QtWebEngineWidgets import QWebEngineView
from bs4 import BeautifulSoup
from urllib.parse import urlparse, urljoin
from playwright.async_api import async_playwright

# ------------------- HTML download & clean functions -------------------
def download_html(url, output_file):
    try:
        print(f"📥 Downloading: {url}")
        # A timeout keeps the GUI from hanging forever on an unresponsive server
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        with open(output_file, 'w', encoding='utf-8') as f:
            f.write(response.text)
        print(f"✅ Page saved as {output_file}")
    except requests.exceptions.RequestException as e:
        print(f"❌ Error while downloading the page: {e}")
        raise

def download_image(img_url, folder, html_file):
    try:
        os.makedirs(folder, exist_ok=True)
        local_filename = os.path.join(folder, os.path.basename(urlparse(img_url).path))
        response = requests.get(img_url, stream=True, timeout=30)
        response.raise_for_status()
        with open(local_filename, 'wb') as f:
            for chunk in response.iter_content(1024):
                f.write(chunk)
        # Return a path relative to the HTML file so the link works locally
        return os.path.relpath(local_filename, os.path.dirname(html_file))
    except Exception as e:
        # On any failure, keep the original remote URL
        print(f"❌ Error while downloading image {img_url}: {e}")
        return img_url

def get_featured_image(soup, base_url, html_file):
    div_thumb = soup.find('div', class_='bs-blog-thumb')
    if div_thumb:
        img_tag = div_thumb.find('img')
        if img_tag and img_tag.get('src'):
            img_url = urljoin(base_url, img_tag['src'])
            img_tag['src'] = download_image(img_url, f"{os.path.splitext(html_file)[0]}_images", html_file)
            return img_tag
    a_thumb = soup.find('a', class_='bs-blog-thumb caption')
    if a_thumb:
        img_tag = a_thumb.find('img')
        if img_tag and img_tag.get('src'):
            img_url = urljoin(base_url, img_tag['src'])
            img_tag['src'] = download_image(img_url, f"{os.path.splitext(html_file)[0]}_images", html_file)
            return img_tag
    return None

def clean_html(html_file, base_url):
    with open(html_file, 'r', encoding='utf-8') as f:
        soup = BeautifulSoup(f, 'html.parser')

    main = soup.find('article') or soup.find('div', class_='entry-content') or soup.find('main') or soup.body

    post_nav = main.find(class_='post-navigation')
    if post_nav:
        for elem in post_nav.find_all_next():
            elem.decompose()
        post_nav.decompose()

    TABLE_CSS = """
    <style>
    table {border-collapse: collapse; width: 100%; margin: 10px 0;}
    th, td {border: 1px solid #333; padding: 6px 8px; text-align: left;}
    th {background-color: #f0f0f0;}
    img {max-width: 100%; height: auto;}
    </style>
    """

    MATHJAX_SCRIPT = """
    <script>
    window.MathJax = {
      tex: {
        inlineMath: [['$', '$'], ['\\\\(', '\\\\)']],
        displayMath: [['$$','$$'], ['\\\\[','\\\\]']]
      }
    };
    </script>
    <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
    """

    img_folder = f"{os.path.splitext(html_file)[0]}_images"
    os.makedirs(img_folder, exist_ok=True)

    featured_img_tag = get_featured_image(soup, base_url, html_file)

    for img in main.find_all('img'):
        src = img.get('src')
        if not src:
            continue
        img_url = urljoin(base_url, src)
        img['src'] = download_image(img_url, img_folder, html_file)

    naslov_tag = soup.new_tag('h1')
    # soup.title.string is None when <title> is empty or contains nested tags
    naslov_tag.string = soup.title.string if soup.title and soup.title.string else 'Page title'
    naslov_tag['style'] = 'text-align:center;font-size:28px;color:black;'
    main.insert(0, naslov_tag)

    if featured_img_tag:
        naslov_tag.insert_after(featured_img_tag)

    html_str = f"<html><head>{TABLE_CSS}{MATHJAX_SCRIPT}</head><body>{str(main)}</body></html>"
    with open(html_file, 'w', encoding='utf-8') as f:
        f.write(html_str)
    print(f"✅ Cleaned HTML saved to: {html_file}")
    return html_file

# ------------------- HTML Syntax Highlighting -------------------
class HTMLHighlighter(QSyntaxHighlighter):
    def __init__(self, parent=None):
        super().__init__(parent)
        self.highlightingRules = []

        tagFormat = QTextCharFormat()
        tagFormat.setForeground(QColor("blue"))
        tagPattern = QRegExp("</?[a-zA-Z][^>]*>")
        self.highlightingRules.append((tagPattern, tagFormat))

        attrFormat = QTextCharFormat()
        attrFormat.setForeground(QColor("darkred"))
        attrPattern = QRegExp("\\b[a-zA-Z-:]+(?=\\=)")
        self.highlightingRules.append((attrPattern, attrFormat))

        valueFormat = QTextCharFormat()
        valueFormat.setForeground(QColor("darkgreen"))
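        # QRegExp has no lazy quantifiers such as .*?; a negated class keeps each match inside one pair of quotes
        valuePattern = QRegExp("\"[^\"]*\"")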
        self.highlightingRules.append((valuePattern, valueFormat))

        commentFormat = QTextCharFormat()
        commentFormat.setForeground(QColor("gray"))
        commentFormat.setFontItalic(True)
        commentPattern = QRegExp("<!--[^>]*-->")
        self.highlightingRules.append((commentPattern, commentFormat))

        # QRegExp does not support lookbehind, so each pattern below matches the
        # whole element (tags included), and only when it sits on a single line,
        # because highlightBlock() receives one text block at a time.
        cssFormat = QTextCharFormat()
        cssFormat.setForeground(QColor("darkmagenta"))
        cssPattern = QRegExp("<style[^>]*>.*</style>")
        cssPattern.setMinimal(True)
        self.highlightingRules.append((cssPattern, cssFormat))

        jsFormat = QTextCharFormat()
        jsFormat.setForeground(QColor("darkcyan"))
        jsPattern = QRegExp("<script[^>]*>.*</script>")
        jsPattern.setMinimal(True)
        self.highlightingRules.append((jsPattern, jsFormat))

    def highlightBlock(self, text):
        for pattern, fmt in self.highlightingRules:
            index = pattern.indexIn(text)
            while index >= 0:
                length = pattern.matchedLength()
                self.setFormat(index, length, fmt)
                index = pattern.indexIn(text, index + length)

# ------------------- PDF Preview Dialog -------------------
class PDFPreviewDialog(QDialog):
    def __init__(self, html_content, base_path, parent=None):
        super().__init__(parent)
        self.setWindowTitle("PDF Preview")
        self.resize(800, 1100)
        self.base_path = base_path
        self.html_content = html_content

        layout = QVBoxLayout(self)
        self.preview = QWebEngineView()
        layout.addWidget(self.preview)

        btn_save_pdf = QPushButton("💾 Save PDF")
        layout.addWidget(btn_save_pdf)
        btn_save_pdf.clicked.connect(self.save_pdf)

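        # Build the file URL with QUrl so that it is also valid for Windows paths
        base_url = QUrl.fromLocalFile(self.base_path + "/").toString()
        base_tag = f'<base href="{base_url}">'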
        full_html = f"""
        <html>
        <head>
            {base_tag}
            <meta charset="utf-8">
            <style>
                @page {{ size: A4; margin: 2cm; }}
                body {{ font-family: Arial, sans-serif; padding: 0; }}
                img {{ max-width: 100%; height: auto; }}
                pre, code {{ white-space: pre-wrap !important; word-wrap: break-word !important; overflow-x: auto; font-family: "Courier-PS", "Courier", "Courier New", monospace; font-size: 11pt; line-height: 1.3; background-color: #f4f4f4; border-radius: 4px; padding: 8px; border: 1px solid #ddd;}}
                table, th, td {{ border: 1px solid #777; border-collapse: collapse; padding: 5px; }}
                th {{ background-color: #ddd; }}
            </style>
            <script>
                window.MathJax = {{
                  tex: {{
                    inlineMath: [['$', '$'], ['\\\\(', '\\\\)']],
                    displayMath: [['$$','$$'], ['\\\\[','\\\\]']]
                  }}
                }};
            </script>
            <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
        </head>
        <body>{self.html_content}</body>
        </html>
        """
        self.preview.setHtml(full_html, QUrl.fromLocalFile(self.base_path + "/"))

    def save_pdf(self):
        pdf_file, _ = QFileDialog.getSaveFileName(self, "Save PDF", "", "PDF Files (*.pdf)")
        if not pdf_file:
            return
        if not pdf_file.lower().endswith(".pdf"):
            pdf_file += ".pdf"
        pdf_file = os.path.abspath(pdf_file)

        def pdf_callback(pdf_bytes):
            # Called by Qt with the generated PDF data once printing has finished
            if pdf_bytes:
                with open(pdf_file, "wb") as f:
                    f.write(pdf_bytes)
                QMessageBox.information(self, "PDF Export", f"PDF saved:\n{pdf_file}")
            else:
                QMessageBox.warning(self, "PDF Export", "The PDF was not generated or is empty.")
            self.accept()

        self.preview.page().printToPdf(pdf_callback)

# ------------------- Glavni HTML Editor -------------------
class HTMLEditor(QWidget):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("HTML Editor + WordPress Import + PDF + ODT + DOCX")
        self.resize(1600, 900)
        self.filename = None
        self.base_path = os.getcwd()
        self.search_results = []
        self.current_search_index = -1

        layout = QVBoxLayout(self)

        # Toolbar
        toolbar = QHBoxLayout()
        btn_open = QPushButton("📂 Open")
        btn_save = QPushButton("💾 Save")
        btn_save_as = QPushButton("💾 Save As")
        btn_pdf = QPushButton("📄 Save as PDF")
        btn_odt = QPushButton("📄 Save as ODT")
        btn_docx = QPushButton("📄 Save as DOCX")
        btn_about = QPushButton("About HTML Editor")
        toolbar.addWidget(btn_open)
        toolbar.addWidget(btn_save)
        toolbar.addWidget(btn_save_as)
        toolbar.addWidget(btn_pdf)
        toolbar.addWidget(btn_odt)
        toolbar.addWidget(btn_docx)
        toolbar.addWidget(btn_about)
        toolbar.addStretch()

        # --- Search bar ---
        self.search_button = QPushButton("Search")
        self.search_input = QLineEdit()
        self.search_input.setPlaceholderText("Search in HTML...")
        self.search_next = QPushButton("Next")
        self.search_prev = QPushButton("Prev")
        toolbar.addWidget(self.search_button)
        toolbar.addWidget(self.search_input)
        toolbar.addWidget(self.search_next)
        toolbar.addWidget(self.search_prev)
        self.search_button.clicked.connect(self.search_text)
        self.search_input.returnPressed.connect(self.search_text)
        self.search_next.clicked.connect(lambda: self.navigate_search(1))
        self.search_prev.clicked.connect(lambda: self.navigate_search(-1))

        # WordPress URL input + button
        self.wp_button = QPushButton("Load URL")
        self.wp_input = QLineEdit()
        self.wp_input.setPlaceholderText("Enter WordPress article URL")
        toolbar.addWidget(self.wp_button)
        toolbar.addWidget(self.wp_input)

        layout.addLayout(toolbar)

        # Splitter
        splitter = QSplitter(Qt.Horizontal)
        self.editor = QTextEdit()
        self.editor.setFont(QFont("Consolas", 12))
        self.editor.setLineWrapMode(QTextEdit.WidgetWidth)
        splitter.addWidget(self.editor)

        self.preview = QWebEngineView()
        splitter.addWidget(self.preview)
        splitter.setSizes([800, 800])
        layout.addWidget(splitter)

        # Syntax highlighting
        self.highlighter = HTMLHighlighter(self.editor.document())

        # Buttons
        btn_open.clicked.connect(self.open_file)
        btn_save.clicked.connect(self.save_file)
        btn_save_as.clicked.connect(self.save_file_as)
        btn_pdf.clicked.connect(self.save_pdf)
        btn_odt.clicked.connect(self.save_odt)
        btn_docx.clicked.connect(self.save_docx)
        btn_about.clicked.connect(self.show_about)
        self.wp_button.clicked.connect(self.load_wordpress_url)
        self.wp_input.returnPressed.connect(self.load_wordpress_url)

        # Live preview
        self.editor.textChanged.connect(self.schedule_preview)
        self.preview_timer = QTimer()
        self.preview_timer.setSingleShot(True)
        self.preview_timer.timeout.connect(self.update_preview)

    # --- WordPress Import ---
    def load_wordpress_url(self):
        url = self.wp_input.text().strip()
        if not url:
            return
        try:
            parsed = urlparse(url)
            base_name = parsed.path.strip("/").split("/")[-1] or "wordpress_stranica"
            local_file = os.path.abspath(f"{base_name}.html")
            download_html(url, local_file)
            clean_html(local_file, url)
            with open(local_file, "r", encoding="utf-8") as f:
                content = f.read()
            self.editor.setPlainText(content)
            self.filename = local_file
            self.base_path = os.path.dirname(local_file)
            self.setWindowTitle(f"HTML Editor — {os.path.basename(local_file)}")
            self.update_preview()
        except Exception as e:
            QMessageBox.warning(self, "WordPress Import", f"Error: {e}")

    # --- File operations ---
    def open_file(self):
        filename, _ = QFileDialog.getOpenFileName(self, "Open HTML", "", "HTML Files (*.html *.htm)")
        if filename:
            with open(filename, "r", encoding="utf-8") as f:
                content = f.read()
            self.editor.setPlainText(content)
            self.filename = filename
            self.base_path = os.path.dirname(os.path.abspath(filename))
            self.setWindowTitle(f"HTML Editor — {os.path.basename(filename)}")
            self.update_preview()

    def save_file(self):
        if not self.filename:
            self.save_file_as()
            return
        with open(self.filename, "w", encoding="utf-8") as f:
            f.write(self.editor.toPlainText())
        QMessageBox.information(self, "Saved", f"✅ File saved: {self.filename}")

    def save_file_as(self):
        filename, _ = QFileDialog.getSaveFileName(self, "Save HTML", "", "HTML Files (*.html *.htm)")
        if filename:
            self.filename = filename
            self.save_file()
            self.setWindowTitle(f"HTML Editor — {os.path.basename(filename)}")

    # --- Search ---
    def search_text(self):
        text = self.search_input.text()
        if not text:
            return
        self.search_results = []
        cursor = self.editor.textCursor()
        cursor.movePosition(QTextCursor.Start)
        self.editor.setTextCursor(cursor)
        while self.editor.find(text):
            cursor = self.editor.textCursor()
            self.search_results.append(cursor.selectionStart())
        if self.search_results:
            self.current_search_index = 0
            self.highlight_search_result()
        else:
            QMessageBox.information(self, "Search", f"No matches for '{text}' found.")

    def highlight_search_result(self):
        if 0 <= self.current_search_index < len(self.search_results):
            pos = self.search_results[self.current_search_index]
            cursor = self.editor.textCursor()
            cursor.setPosition(pos)
            cursor.movePosition(QTextCursor.Right, QTextCursor.KeepAnchor, len(self.search_input.text()))
            self.editor.setTextCursor(cursor)

    def navigate_search(self, step):
        if not self.search_results:
            return
        self.current_search_index = (self.current_search_index + step) % len(self.search_results)
        self.highlight_search_result()

    # --- Live preview ---
    def schedule_preview(self):
        self.preview_timer.start(400)

    def update_preview(self):
        html_content = self.editor.toPlainText()
        self.preview.setHtml(html_content, QUrl.fromLocalFile(self.base_path + "/"))

    # --- MathJax + temporary HTML for Pandoc ---
    # --- Update mathml html (save current editor to *_mathml.html) ---
    def update_mathml_html(self):
        if not self.filename:
            return None
        base_name, _ = os.path.splitext(self.filename)
        mathml_file = base_name + "_mathml.html"
        with open(mathml_file, "w", encoding="utf-8") as f:
            f.write(self.editor.toPlainText())
        return mathml_file

    # --- helper: render MathJax in headless browser and write rendered HTML to file ---
    async def render_mathjax_and_write(self, html_content, output_html):
        """
        Load html_content into a headless browser, wait for MathJax to render,
        then save the rendered HTML to output_html.
        """
        MATHJAX_SCRIPT = """
        <script type="text/javascript">
        window.MathJax = {
          tex: {
            inlineMath: [['$', '$'], ['\\\\(', '\\\\)']],
            displayMath: [['$$','$$'], ['\\\\[','\\\\]']]
          },
          options: { skipHtmlTags: ['noscript','style','textarea'] }
        };
        </script>
        <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
        """
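        # Ensure MathJax script is present in the HTML we load.
        # Locate <head> case-insensitively so that <HEAD> is handled as well.
        head_pos = html_content.lower().find("<head>")
        if head_pos != -1:
            insert_at = head_pos + len("<head>")
            html_to_load = html_content[:insert_at] + "\n" + MATHJAX_SCRIPT + html_content[insert_at:]
        else:
            html_to_load = MATHJAX_SCRIPT + html_content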

        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=True)
            page = await browser.new_page()
            await page.set_content(html_to_load, wait_until="load")
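            # Wait until the MathJax library itself has loaded (the config object
            # alone already defines window.MathJax), then typeset the page
            try:
                await page.wait_for_function(
                    "() => window.MathJax && typeof window.MathJax.typesetPromise === 'function'",
                    timeout=5000,
                )
                await page.evaluate("() => MathJax.typesetPromise()")
            except Exception:
                # Even if waiting fails, attempt to proceed
                pass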
            rendered = await page.content()
            await browser.close()

        with open(output_html, "w", encoding="utf-8") as f:
            f.write(rendered)

    # --- Save PDF ---
    def save_pdf(self):
        if not self.filename:
            QMessageBox.warning(self, "PDF Export", "Please save HTML file first!")
            return
        html_content = self.editor.toPlainText()
        dlg = PDFPreviewDialog(html_content, self.base_path, self)
        dlg.exec_()

    # --- Save ODT (restored behavior that renders MathJax then pandoc) ---
    def save_odt(self):
        if not self.filename:
            QMessageBox.warning(self, "ODT Export", "Please save HTML file first!")
            return

        mathml_file = self.update_mathml_html()
        if not mathml_file:
            QMessageBox.warning(self, "ODT Export", "Cannot create _mathml.html")
            return

        base_name, _ = os.path.splitext(mathml_file)
        output_odt = base_name.replace("_mathml", "") + ".odt"
        html_content = self.editor.toPlainText()

        try:
            # render MathJax and write to mathml file
            asyncio.run(self.render_mathjax_and_write(html_content, mathml_file))
            # run pandoc to convert rendered html -> odt
            subprocess.run(["pandoc", mathml_file, "-s", "-o", output_odt, "--mathml"], check=True)
            QMessageBox.information(self, "ODT Export", f"✅ ODT file saved:\n{output_odt}")
        except Exception as e:
            QMessageBox.warning(self, "ODT Export", f"❌ Conversion error:\n{e}")

    # --- Save DOCX (styles integrated) ---
    def save_docx(self):
        if not self.filename:
            QMessageBox.warning(self, "DOCX Export", "Please save HTML file first!")
            return

        mathml_file = self.update_mathml_html()
        if not mathml_file:
            QMessageBox.warning(self, "DOCX Export", "Cannot create _mathml.html")
            return

        base_name, _ = os.path.splitext(mathml_file)
        output_docx = base_name.replace("_mathml", "") + ".docx"
        html_content = self.editor.toPlainText()

        try:
            asyncio.run(self.render_mathjax_and_write(html_content, mathml_file))
            subprocess.run(["pandoc", mathml_file, "-s", "-o", output_docx, "--mathml"], check=True)

            # --- Styles ---
            from docx import Document
            from docx.shared import Pt, RGBColor
            from docx.enum.style import WD_STYLE_TYPE

            doc = Document(output_docx)
            styles = doc.styles

            paragraph_styles = {
                "Title": ("Cambria Math", 20),
                "Heading 1": ("Cambria Math", 16),
                "Heading 2": ("Cambria Math", 14),
                "Heading 3": ("Cambria Math", 14),
                "Heading 4": ("Cambria Math", 14),
                "Normal": ("Cambria Math", 11),
                "Source Code": ("Courier New", 10),
                "First Paragraph": ("Cambria Math", 11),
                "Body Text": ("Cambria Math", 11),
                "Compact": ("Cambria Math", 11)
            }

            for name, (font, size) in paragraph_styles.items():
                for s in styles:
                    if s.name == name and s.type == WD_STYLE_TYPE.PARAGRAPH:
                        s.font.name = font
                        s.font.size = Pt(size)
                        s.font.color.rgb = RGBColor(0,0,0)

            for p in doc.paragraphs:
                if p.style.name == "Source Code":
                    for r in p.runs:
                        r.font.name = "Courier New"
                        r.font.size = Pt(10)
                        r.font.color.rgb = RGBColor(0,0,0)

            character_styles = [
                "KeywordTok","DataTypeTok","DecValTok","BaseNTok","FloatTok","ConstantTok",
                "CharTok","SpecialCharTok","StringTok","VerbatimStringTok","SpecialStringTok",
                "ImportTok","CommentTok","DocumentationTok","AnnotationTok","CommentVarTok",
                "OtherTok","FunctionTok","VariableTok","ControlFlowTok","OperatorTok","BuiltInTok",
                "ExtensionTok","PreprocessorTok","AttributeTok","RegionMarkerTok","InformationTok",
                "WarningTok","AlertTok","ErrorTok","NormalTok"
            ]

            for name in character_styles:
                for s in styles:
                    if s.name == name and s.type == WD_STYLE_TYPE.CHARACTER:
                        s.font.name = "Times New Roman"
                        s.font.size = Pt(10)
                        s.font.color.rgb = RGBColor(0,0,0)

            doc.save(output_docx)
            QMessageBox.information(self, "DOCX Export", f"✅ DOCX file saved with styles:\n{output_docx}")

        except Exception as e:
            QMessageBox.warning(self, "DOCX Export", f"❌ Error during conversion or while applying styles:\n{e}")

    # --- About ---

    def show_about(self):
        about_text = (
            "<h3>🧩 About HTML Editor</h3>"
            "<p><b>HTML Editor</b> is a versatile desktop application designed for editing, previewing, and exporting HTML documents.</p>"
            "<p>It provides a clean dual-view interface with a real-time preview and supports advanced content such as:</p>"
            "<ul>"
            "<li>✅ MathJax / LaTeX mathematical formulas</li>"
            "<li>✅ WordPress articles with text, images, and code blocks</li>"
            "<li>✅ EnlighterJS code highlighting</li>"
            "<li>✅ Export to <b>PDF</b>, <b>DOCX</b>, and <b>ODT</b> formats</li>"
            "</ul>"
            "<p>The application is optimized for scientific, technical, and academic publishing — ensuring perfect preservation of mathematical expressions and code formatting during conversion and printing.</p>"
            "<p><b>Author:</b> Aleksandar Maričić<br>"
            "<b>Version:</b> 1.0<br>"
            "<b>License:</b> Free for personal and academic use<br>"
            "<b>Platform:</b> Python + PyQt5</p>"
        )
        QMessageBox.information(self, "About HTML Editor", about_text)


# ------------------- Main -------------------
if __name__ == "__main__":
    app = QApplication(sys.argv)
    editor = HTMLEditor()
    editor.show()
    sys.exit(app.exec_())
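
In the listing above, save_odt and save_docx derive the output name by removing "_mathml" from the whole path with str.replace, which would also alter a directory name that happens to contain that text. Both methods also call pandoc without first checking that it is installed. A minimal sketch of two small helpers that could cover these cases follows. The names export_path and pandoc_available are hypothetical and are not part of htmleditor.py.

```python
import os
import shutil


def export_path(mathml_file: str, extension: str) -> str:
    """Derive the export path from a '<name>_mathml.html' intermediate file.

    Only a trailing '_mathml' on the file stem is removed, so a directory
    whose name contains '_mathml' is left untouched.
    (Hypothetical helper, not part of the original program.)
    """
    stem, _ = os.path.splitext(mathml_file)
    suffix = "_mathml"
    if stem.endswith(suffix):
        stem = stem[: -len(suffix)]
    return stem + extension


def pandoc_available() -> bool:
    """Return True if a 'pandoc' executable can be found on PATH.

    (Hypothetical helper, not part of the original program.)
    """
    return shutil.which("pandoc") is not None
```

An export method could call pandoc_available() before starting the conversion and show a warning dialog if it returns False, rather than surfacing the exception raised by subprocess.run.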

MIT License

Copyright (c) 2025 Aleksandar Maričić

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE. 

The complete program, with installation scripts and instructions, can be downloaded from this link:

https://abel.rs/wp-content/uploads/2025/10/htmleditor-za-linux-alat-za-uredjivanje-pregled-i-konverziju-wordpress-sadrzaja-u-docx-i-pdf-1.zip

By Abel
