HTML Editor: an advanced tool for editing, converting, and previewing HTML documents
Author: Aleksandar Maričić
Version: 1.0
Platform: Python + PyQt5
License: Free for personal and academic use
Fonts to install: cambria-math.ttf, courbd.ttf, courbi.ttf, couri.ttf, and cour.ttf
HTML Editor (htmleditor.py) is a desktop program for editing, previewing, and converting HTML documents with rich content, including mathematical formulas (MathJax / LaTeX), tables, images, and program code.
The program is written in Python with the PyQt5 library. Its graphical interface pairs a simple workflow with features aimed at authors of scientific and technical texts.
Key features
- HTML editing: a text editor with HTML syntax highlighting (tags, attributes, values, comments, CSS, and JavaScript).
- Live preview: the result is rendered automatically in the right-hand pane, with support for MathJax formulas, images, and styles.
- Import of articles from WordPress sites: enter a URL and the article is downloaded, stripped of unneeded parts, and saved locally with its images and formatting.
- Automatic image download: all images from the web page are saved to a separate local folder and linked correctly in the HTML file.
- MathJax / LaTeX support: inline and display formulas are rendered and exported to all output formats (PDF, ODT, DOCX).
- Advanced search: a tool for searching text in the HTML code, with Next / Prev navigation and colored highlighting of matches.
- Export to several formats: documents can be saved as:
  - PDF: a print-ready rendering that matches the preview
  - DOCX: compatible with Microsoft Word
  - ODT: compatible with LibreOffice / OpenOffice Writer
- EnlighterJS and program code: the syntax formatting of code examples in the HTML is preserved (useful for programming and technical articles).
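The handling of code examples mentioned in the last item can be illustrated with a small naming rule: a leading comment such as "# demo.py" is taken as the file name, otherwise a numbered name is built from the block's language. This is a minimal sketch; the helper name target_file_name and the shortened EXTENSIONS table are assumptions made for illustration.

```python
import os

# Hypothetical, shortened language-to-extension table (an assumption for this sketch).
EXTENSIONS = {"python": "py", "javascript": "js", "bash": "sh", "html": "html"}


def target_file_name(code_text, language, index):
    """Choose a file name for the index-th extracted code block."""
    lines = code_text.strip().splitlines()
    first_line = lines[0].strip() if lines else ""
    # A leading comment such as "# demo.py" is treated as the file name.
    if first_line.startswith("#") and "." in first_line:
        # basename() drops directory parts so the file stays in the output folder.
        return os.path.basename(first_line.lstrip("#").strip())
    # Otherwise fall back to a numbered name with a language-based extension.
    ext = EXTENSIONS.get(language.lower(), "txt")
    return f"program{index:02d}.{ext}"
```

A block with no naming comment and an unknown language falls back to a .txt file, so nothing is silently dropped.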
How it works
The program combines two synchronized areas:
- Left pane: a text editor with syntax coloring and line numbering, suited to editing HTML code directly.
- Right pane: a dynamic view of the content (rendered with QWebEngineView) with support for CSS, images, and MathJax formulas.
Every change to the code is reflected in the preview after a short delay (the code restarts a 400 ms single-shot timer on each edit), so the preview stays current without manual refreshing.
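The delayed refresh is a debounce: each edit pushes the refresh back, and the preview is rebuilt once typing pauses. The sketch below shows the idea without any GUI code. The Debouncer class and its explicit clock arguments are assumptions for illustration; the program itself relies on a single-shot QTimer that is restarted on every text change.

```python
class Debouncer:
    """Collapse a burst of change events into one refresh after a quiet period."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms
        self.deadline = None  # time in ms at which the refresh becomes due

    def on_change(self, now_ms):
        # Every edit moves the deadline, like restarting a single-shot timer.
        self.deadline = now_ms + self.delay_ms

    def poll(self, now_ms):
        """Return True exactly once, when the quiet period has elapsed."""
        if self.deadline is not None and now_ms >= self.deadline:
            self.deadline = None
            return True
        return False
```

With a 400 ms delay, edits at 0 ms and 300 ms produce a single refresh at 700 ms rather than two.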
Importing articles from WordPress sites
One of the program's most useful features is the automatic import and cleanup of articles from WordPress sites. Enter the article URL and the program will:
- download the HTML content,
- extract the main body (title, images, tables, formulas),
- remove navigation and advertising elements,
- download all images and save them locally,
- add CSS styles and MathJax scripts,
- format the article for further editing and conversion.
The result is a clean, compact HTML file with locally stored images (MathJax itself is loaded from a CDN), ready for publishing, archiving, or conversion to other formats.
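The naming convention behind the import can be sketched as follows: the last segment of the URL path becomes the HTML file name, and a folder with the same name (without the extension) holds the images. The helper names local_targets and relative_src are assumptions for this sketch; the fallback name wordpress_stranica is the one used in the program's source.

```python
import os
from urllib.parse import urlparse


def local_targets(url, work_dir="."):
    """Return (html_path, image_folder) for an article URL."""
    # The last path segment of the URL is used as the base name.
    slug = urlparse(url).path.strip("/").split("/")[-1] or "wordpress_stranica"
    html_path = os.path.join(work_dir, f"{slug}.html")
    # The image folder has the same name as the HTML file, minus the extension.
    image_folder = os.path.splitext(html_path)[0]
    return html_path, image_folder


def relative_src(image_path, html_path):
    """Return the image path relative to the folder that contains the HTML file."""
    return os.path.relpath(image_path, os.path.dirname(html_path))
```

Relative links keep the HTML file and its image folder movable as a pair.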
Conversion and export
HTML Editor exports directly to PDF, ODT, and DOCX. During conversion, MathJax formulas are first rendered in a headless browser (Playwright); the rendered HTML is then passed to Pandoc, which generates the output document.
This two-stage process is what carries mathematical expressions and formatting over into Word and LibreOffice files.
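The second stage, handing the rendered HTML to Pandoc, might look like the sketch below. The function names are assumptions, and the exact options the program passes to Pandoc are not shown in this document; only the standard --from and --output options are used here, and Pandoc picks the output format from the file extension.

```python
import shutil
import subprocess


def pandoc_command(rendered_html, output_file):
    """Build the pandoc argument list; the output format follows the extension."""
    return ["pandoc", rendered_html, "--from", "html", "--output", output_file]


def convert(rendered_html, output_file):
    """Run pandoc if it is on PATH, otherwise raise a clear error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    # check=True turns a non-zero pandoc exit status into an exception.
    subprocess.run(pandoc_command(rendered_html, output_file), check=True)
```

For example, convert("article_mathml.html", "article.docx") would produce a Word file, and the same call with an .odt name a LibreOffice file.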
Search and navigation
The Search function finds any text in the HTML code, highlights the matches in yellow, and steps through them with the Next and Prev buttons. The current match is additionally marked in orange, and the preview scrolls automatically to the corresponding part of the text.
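The navigation logic reduces to two small operations: collecting the offsets of all matches, and moving an index through them cyclically. The sketch below uses a plain case-sensitive string search, which may differ from the editor widget's own matching; the names find_all and step are assumptions for illustration.

```python
def find_all(text, needle):
    """Return the start offset of every non-overlapping occurrence of needle."""
    positions = []
    start = text.find(needle)
    # An empty needle is skipped, because it would match at every position.
    while needle and start != -1:
        positions.append(start)
        start = text.find(needle, start + len(needle))
    return positions


def step(index, delta, total):
    """Move through the results cyclically (Next past the last match wraps around)."""
    return (index + delta) % total
```

The modulo makes Prev on the first match jump to the last one, and Next on the last match return to the first.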
Use cases
HTML Editor is well suited to:
- teachers, researchers, and authors of scientific articles with formulas,
- programmers and technical writers who document code,
- bloggers who want to download, edit, and print WordPress posts,
- students preparing papers in Word, LibreOffice, or PDF format.
The program combines a text editor, a previewer, a converter, and a content downloader in a single application.
Technical characteristics
- Language: Python 3
- GUI: PyQt5
- Rendering: QWebEngineView (Chromium)
- Conversion: Pandoc + Playwright
- Math: MathJax 3
- License: Free for personal and academic use
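Before starting the program it can help to check that the components listed above are present. The following is a minimal sketch using only the standard library; the function name missing_dependencies is an assumption, and the module names in the usage note are the usual import names of the listed packages.

```python
import importlib.util
import shutil


def missing_dependencies(modules, programs):
    """Return the Python modules and executables that cannot be found."""
    # find_spec() returns None when a top-level module is not installed.
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    # which() returns None when an executable is not on PATH.
    missing += [p for p in programs if shutil.which(p) is None]
    return missing
```

A call such as missing_dependencies(["PyQt5", "bs4", "requests", "playwright"], ["pandoc"]) returns an empty list when everything is installed.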
Conclusion
htmleditor.py is a complete tool for processing and converting HTML documents, with support for images, code, formulas, and tables. It lets authors and researchers import content from WordPress, edit it locally, and export it to PDF, DOCX, and ODT.
A simple interface and support for mathematical expressions make it a good fit for anyone who wants to bring web content, scientific formulas, and text processing together in one working environment.
HTML Editor: science, technology, and content in harmony.
INSTALLATION:
Download and install pandoc for your operating system from:
https://github.com/jgm/pandoc/releases
1. Installation script for Linux: install_htmleditor.sh
#!/bin/bash
# install_htmleditor.sh
# Installs all system-wide dependencies for htmleditor.py (without a venv)
set -e # stop at the first error
LOGFILE="install_global_log.txt"
echo "=========================================="
echo " Global installation of htmleditor.py dependencies"
echo "=========================================="
# --- Check Python ---
if ! command -v python3 &> /dev/null; then
    echo "❌ Python3 is not installed. Install it first."
    exit 1
fi
# --- Check pip ---
if ! command -v pip3 &> /dev/null; then
    echo "📦 Installing pip..."
    if [ -f /etc/debian_version ]; then
        sudo apt update -y && sudo apt install -y python3-pip
    elif [ -f /etc/fedora-release ]; then
        sudo dnf install -y python3-pip
    elif [ -f /etc/arch-release ]; then
        sudo pacman -Sy --noconfirm python-pip
    else
        echo "⚠️ Unknown distribution. Install pip manually."
        exit 1
    fi
else
    echo "✅ pip3 is already installed."
fi
# --- Upgrade pip ---
echo "🔄 Upgrading pip..."
sudo pip3 install --upgrade pip setuptools wheel &>> "$LOGFILE"
# --- Install the Python packages globally ---
echo "📦 Installing Python packages..."
sudo pip3 install --upgrade PyQt5 PyQtWebEngine requests beautifulsoup4 lxml reportlab playwright python-docx &>> "$LOGFILE"
# --- Install the Playwright Chromium browser ---
echo "🌐 Installing the Playwright browser..."
python3 -m playwright install chromium &>> "$LOGFILE"
# --- Check pandoc ---
if ! command -v pandoc &> /dev/null; then
    echo "📄 Pandoc was not found. Installing pandoc..."
    if [ "$(uname)" == "Darwin" ]; then
        if ! command -v brew &> /dev/null; then
            echo "⚠️ Homebrew is not installed. Install it like this:"
            echo '/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"'
            exit 1
        fi
        brew install pandoc
    elif [ -f /etc/debian_version ]; then
        sudo apt update -y && sudo apt install -y pandoc
    elif [ -f /etc/fedora-release ]; then
        sudo dnf install -y pandoc
    elif [ -f /etc/arch-release ]; then
        sudo pacman -Sy --noconfirm pandoc
    else
        echo "⚠️ Unknown distribution. Please install pandoc manually:"
        echo "👉 https://pandoc.org/installing.html"
    fi
else
    echo "✅ Pandoc is already installed."
fi
# --- Check htmleditor.py ---
if [ ! -f "htmleditor.py" ]; then
    echo "❌ The file 'htmleditor.py' was not found in the current directory!"
    exit 1
fi
echo "------------------------------------------"
echo "🚀 Starting htmleditor.py (global installation)..."
python3 htmleditor.py
echo "✅ Done! (detailed log: $LOGFILE)"
Save the contents to a file named install_htmleditor.sh and run it with:
chmod +x install_htmleditor.sh
./install_htmleditor.sh
2. Installation script for Windows (virtual environment)
The following Windows script creates a virtual environment, installs the required Python packages and Playwright with Chromium, checks for pandoc, and finally starts htmleditor.py.
Start the script by double-clicking run_htmleditor_in_venv.bat
@echo off
REM ===============================
REM run_htmleditor_in_venv.bat
REM Sets up and runs htmleditor.py in a virtual environment on Windows 10
REM ===============================
REM --- Set variables ---
set VENV_DIR=venv
set LOGFILE=install_log.txt
echo ==========================================
echo Setting up the virtual environment for htmleditor.py
echo ==========================================
REM --- Check Python ---
where python >nul 2>&1
if %ERRORLEVEL% neq 0 (
    echo Python was not found. Install it first.
    exit /b 1
)
REM --- Check htmleditor.py ---
if not exist htmleditor.py (
    echo The file 'htmleditor.py' was not found in the current directory!
    exit /b 1
)
REM --- Create the virtual environment ---
if not exist "%VENV_DIR%" (
    echo Creating virtual environment: %VENV_DIR%
    python -m venv "%VENV_DIR%"
) else (
    echo Virtual environment already exists: %VENV_DIR%
)
REM --- Activate the venv ---
echo Activating the virtual environment...
call "%VENV_DIR%\Scripts\activate.bat"
REM --- Upgrade pip ---
echo Upgrading pip...
pip install --upgrade pip setuptools wheel >> "%LOGFILE%" 2>&1
REM --- Install the Python packages ---
echo Checking and installing Python packages...
pip install --upgrade PyQt5 PyQtWebEngine requests beautifulsoup4 lxml reportlab playwright python-docx >> "%LOGFILE%" 2>&1
REM --- Install the Playwright Chromium browser ---
echo Checking and installing the Playwright browser...
python -m playwright install chromium >> "%LOGFILE%" 2>&1
REM --- Check pandoc ---
where pandoc >nul 2>&1
if %ERRORLEVEL% neq 0 (
    echo Pandoc was not found. Please install it manually:
    echo https://pandoc.org/installing.html
) else (
    echo Pandoc is already installed.
)
echo ------------------------------------------
echo Starting htmleditor.py...
python htmleditor.py
REM --- Deactivate the venv ---
deactivate
echo Done! (detailed log: %LOGFILE%)
pause
3. (Recommended) Script for Linux/macOS (virtual environment)
Using a virtual environment is recommended.
The following Linux/macOS script creates a virtual environment, installs the required Python packages and Playwright with Chromium, checks for pandoc, and finally starts htmleditor.py.
📝 run_htmleditor_in_venv.sh
#!/bin/bash
# run_htmleditor_in_venv.sh
# Sets up and runs htmleditor.py in a virtual environment
set -e # stop at the first error
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
cd "$SCRIPT_DIR"
VENV_DIR="venv"
LOGFILE="install_log.txt"
echo "=========================================="
echo " Setting up the virtual environment for htmleditor.py"
echo "=========================================="
# --- Check Python ---
if ! command -v python3 &> /dev/null; then
    echo "❌ Python3 is not installed. Install it first."
    exit 1
fi
# --- Check htmleditor.py ---
if [ ! -f "htmleditor.py" ]; then
    echo "❌ The file 'htmleditor.py' was not found in the current directory!"
    exit 1
fi
# --- Create the virtual environment ---
if [ ! -d "$VENV_DIR" ]; then
    echo "🔧 Creating virtual environment: $VENV_DIR"
    python3 -m venv "$VENV_DIR"
else
    echo "✅ Virtual environment already exists: $VENV_DIR"
fi
# --- Activate the venv ---
if [ -z "$VIRTUAL_ENV" ]; then
    echo "⚡ Activating the virtual environment..."
    source "$VENV_DIR/bin/activate"
else
    echo "✅ A virtual environment is already active."
fi
# --- Upgrade pip ---
echo "🔄 Upgrading pip..."
pip install --upgrade pip setuptools wheel &>> "$LOGFILE"
# --- Install the Python packages ---
echo "📦 Installing Python packages..."
pip install --upgrade PyQt5 PyQtWebEngine requests beautifulsoup4 lxml reportlab playwright python-docx &>> "$LOGFILE"
# --- Install the Playwright Chromium browser ---
echo "🌐 Installing the Playwright browser..."
python3 -m playwright install chromium &>> "$LOGFILE"
# --- Check pandoc ---
if ! command -v pandoc &> /dev/null; then
    echo "📄 Pandoc was not found. Installing pandoc..."
    if [ "$(uname)" == "Darwin" ]; then
        if ! command -v brew &> /dev/null; then
            echo "⚠️ Homebrew is not installed. Install it like this:"
            echo '/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"'
            exit 1
        fi
        brew install pandoc
    elif [ -f /etc/debian_version ]; then
        sudo apt update -y && sudo apt install -y pandoc
    elif [ -f /etc/fedora-release ]; then
        sudo dnf install -y pandoc
    elif [ -f /etc/arch-release ]; then
        sudo pacman -Sy --noconfirm pandoc
    else
        echo "⚠️ Unknown distribution. Please install pandoc manually:"
        echo "👉 https://pandoc.org/installing.html"
    fi
else
    echo "✅ Pandoc is already installed."
fi
echo "------------------------------------------"
echo "🚀 Starting htmleditor.py..."
python3 htmleditor.py
# --- Deactivate the venv at the end ---
deactivate
echo "✅ Done! (detailed log: $LOGFILE)"
Source code for htmleditor.py
#!/usr/bin/env python3
# htmleditor.py
# Install the required dependencies: pip install requests PyQt5 PyQtWebEngine beautifulsoup4 playwright
# Then initialize Playwright (before first use) from the command line:
# playwright install
import sys
import os
import requests
import subprocess
import asyncio
import glob
import shutil
from PyQt5.QtWidgets import (
    QApplication, QWidget, QSplitter, QVBoxLayout, QHBoxLayout,
    QTextEdit, QPushButton, QFileDialog, QLineEdit, QMessageBox, QDialog
)
from PyQt5.QtCore import Qt, QTimer, QUrl, QRegExp
from PyQt5.QtGui import QFont, QSyntaxHighlighter, QTextCharFormat, QColor, QTextCursor
from PyQt5.QtWebEngineWidgets import QWebEngineView
from bs4 import BeautifulSoup
from urllib.parse import urlparse, urljoin
from playwright.async_api import async_playwright
# ------------------- HTML download & clean functions -------------------
def download_html(url, output_file):
    try:
        print(f"📥 Downloading: {url}")
        # requests has no default timeout, so set one to avoid hanging forever
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        with open(output_file, 'w', encoding='utf-8') as f:
            f.write(response.text)
        print(f"✅ Page saved as {output_file}")
    except requests.exceptions.RequestException as e:
        print(f"❌ Error while downloading the page: {e}")
        raise
def download_image(img_url, folder, html_file):
    try:
        os.makedirs(folder, exist_ok=True)
        local_filename = os.path.join(folder, os.path.basename(urlparse(img_url).path))
        response = requests.get(img_url, stream=True, timeout=30)
        response.raise_for_status()
        with open(local_filename, 'wb') as f:
            for chunk in response.iter_content(1024):
                f.write(chunk)
        # Return a path relative to the HTML file so the link stays portable
        return os.path.relpath(local_filename, os.path.dirname(html_file))
    except Exception as e:
        print(f"❌ Error while downloading image {img_url}: {e}")
        # Fall back to the original remote URL
        return img_url
def get_featured_image(soup, base_url, html_file):
    div_thumb = soup.find('div', class_='bs-blog-thumb')
    if div_thumb:
        img_tag = div_thumb.find('img')
        if img_tag and img_tag.get('src'):
            img_url = urljoin(base_url, img_tag['src'])
            # The image folder has the same name as the HTML file, without the extension
            img_folder = os.path.splitext(html_file)[0]
            img_tag['src'] = download_image(img_url, img_folder, html_file)
            return img_tag
    a_thumb = soup.find('a', class_='bs-blog-thumb caption')
    if a_thumb:
        img_tag = a_thumb.find('img')
        if img_tag and img_tag.get('src'):
            img_url = urljoin(base_url, img_tag['src'])
            # The image folder has the same name as the HTML file, without the extension
            img_folder = os.path.splitext(html_file)[0]
            img_tag['src'] = download_image(img_url, img_folder, html_file)
            return img_tag
    return None
def clean_html(html_file, base_url):
    with open(html_file, 'r', encoding='utf-8') as f:
        soup = BeautifulSoup(f, 'html.parser')
    main = soup.find('article') or soup.find('div', class_='entry-content') or soup.find('main') or soup.body
    # Drop the post navigation and everything that follows it
    post_nav = main.find(class_='post-navigation')
    if post_nav:
        for elem in post_nav.find_all_next():
            elem.decompose()
        post_nav.decompose()
    TABLE_CSS = """
    <style>
    table {border-collapse: collapse; width: 100%; margin: 10px 0;}
    th, td {border: 1px solid #333; padding: 6px 8px; text-align: left;}
    th {background-color: #f0f0f0;}
    img {max-width: 100%; height: auto;}
    </style>
    """
    MATHJAX_SCRIPT = """
    <script>
    window.MathJax = {
        tex: {
            inlineMath: [['$', '$'], ['\\\\(', '\\\\)']],
            displayMath: [['$$','$$'], ['\\\\[','\\\\]']]
        }
    };
    </script>
    <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
    """
    # The image folder has the same name as the HTML file, without the extension
    img_folder = os.path.splitext(html_file)[0]
    os.makedirs(img_folder, exist_ok=True)
    featured_img_tag = get_featured_image(soup, base_url, html_file)
    for img in main.find_all('img'):
        # The featured image already points to its local copy; downloading it
        # again would resolve that local path against base_url and fail
        if img is featured_img_tag:
            continue
        src = img.get('src')
        if not src:
            continue
        img_url = urljoin(base_url, src)
        img['src'] = download_image(img_url, img_folder, html_file)
    # Insert the page title as a centered heading at the top of the article
    naslov_tag = soup.new_tag('h1')
    naslov_tag.string = soup.title.string if soup.title else 'Page title'
    naslov_tag['style'] = 'text-align:center;font-size:28px;color:black;'
    main.insert(0, naslov_tag)
    if featured_img_tag:
        naslov_tag.insert_after(featured_img_tag)
    html_str = f"<html><head>{TABLE_CSS}{MATHJAX_SCRIPT}</head><body>{str(main)}</body></html>"
    with open(html_file, 'w', encoding='utf-8') as f:
        f.write(html_str)
    print(f"✅ Cleaned HTML saved to: {html_file}")
    return html_file
# ------------------- CodeExtractor functionality -------------------
class CodeExtractor:
    @staticmethod
    def get_language(block):
        """Return a file extension based on the language in the EnlighterJS attributes"""
        lang = block.get("data-enlighter-language", "").strip().lower()
        if not lang:
            # no data-enlighter-language: fall back to the class attribute
            classes = block.get("class", [])
            for c in classes:
                c = c.lower()
                if c in ["python", "html", "css", "javascript", "bash", "cpp"]:
                    lang = c
                    break
        # language → extension mapping
        ext_map = {
            "python": "py",
            "html": "html",
            "css": "css",
            "javascript": "js",
            "bash": "sh",
            "cpp": "cpp",
            "c++": "cpp",
            "json": "json",
            "xml": "xml",
            "php": "php",
        }
        return ext_map.get(lang, "txt")
    @staticmethod
    def extract_code_from_html(html_content, output_folder, base_name):
        """Extract all EnlighterJS blocks and save them directly in output_folder"""
        soup = BeautifulSoup(html_content, 'html.parser')
        # Find all variants of EnlighterJS blocks
        code_blocks = soup.find_all("pre", class_=["EnlighterJS", "enlighterjs", "EnlighterJSRAW"])
        if not code_blocks:
            return 0, "⚠️ No EnlighterJS code blocks found."
        count = 0
        extracted_files = []
        for i, block in enumerate(code_blocks, start=1):
            code_text = block.get_text().strip()
            if not code_text:
                continue
            # Check the first line: if it starts with # and contains a dot, treat it as a file name
            first_line = code_text.splitlines()[0].strip()
            if first_line.startswith("#") and "." in first_line:
                # basename() keeps the file inside output_folder even if the comment contains a path
                file_name = os.path.basename(first_line.lstrip("#").strip())
            else:
                # Otherwise use the extension derived from the EnlighterJS language
                ext = CodeExtractor.get_language(block)
                file_name = f"program{str(i).zfill(2)}.{ext}"
            file_path = os.path.join(output_folder, file_name)
            # Write the code
            with open(file_path, "w", encoding="utf-8") as f:
                f.write(code_text)
            count += 1
            extracted_files.append(file_name)
            print(f"💾 Saved code: {file_path}")
        message = f"✅ Extracted {count} EnlighterJS blocks in total to folder: {output_folder}"
        if extracted_files:
            message += "\n\nGenerated files:\n" + "\n".join(f"• {f}" for f in extracted_files)
        return count, message
    @staticmethod
    def extract_code_from_url(url, output_folder):
        """Download the HTML from a URL and extract the code directly into output_folder"""
        try:
            print(f"📥 Downloading: {url}")
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            html_content = response.text
            parsed = urlparse(url)
            base_name = parsed.path.strip("/").split("/")[-1] or "wordpress_stranica"
            return CodeExtractor.extract_code_from_html(html_content, output_folder, base_name)
        except requests.exceptions.RequestException as e:
            return 0, f"❌ Download error: {e}"
# ------------------- HTML Syntax Highlighting -------------------
class HTMLHighlighter(QSyntaxHighlighter):
    def __init__(self, parent=None):
        super().__init__(parent)
        # Each rule is (pattern, format, capture group to color).
        # QRegExp supports neither lookbehind nor per-quantifier laziness ("*?"),
        # so the rules use capture groups and setMinimal() instead.
        self.highlightingRules = []
        tagFormat = QTextCharFormat()
        tagFormat.setForeground(QColor("blue"))
        tagPattern = QRegExp("</?[a-zA-Z][^>]*>")
        self.highlightingRules.append((tagPattern, tagFormat, 0))
        attrFormat = QTextCharFormat()
        attrFormat.setForeground(QColor("darkred"))
        attrPattern = QRegExp("\\b[a-zA-Z:-]+(?=\\=)")
        self.highlightingRules.append((attrPattern, attrFormat, 0))
        valueFormat = QTextCharFormat()
        valueFormat.setForeground(QColor("darkgreen"))
        valuePattern = QRegExp("\"[^\"]*\"")
        self.highlightingRules.append((valuePattern, valueFormat, 0))
        commentFormat = QTextCharFormat()
        commentFormat.setForeground(QColor("gray"))
        commentFormat.setFontItalic(True)
        commentPattern = QRegExp("<!--.*-->")
        commentPattern.setMinimal(True)
        self.highlightingRules.append((commentPattern, commentFormat, 0))
        cssFormat = QTextCharFormat()
        cssFormat.setForeground(QColor("darkmagenta"))
        cssPattern = QRegExp("<style[^>]*>(.*)</style>")
        cssPattern.setMinimal(True)
        self.highlightingRules.append((cssPattern, cssFormat, 1))
        jsFormat = QTextCharFormat()
        jsFormat.setForeground(QColor("darkcyan"))
        jsPattern = QRegExp("<script[^>]*>(.*)</script>")
        jsPattern.setMinimal(True)
        self.highlightingRules.append((jsPattern, jsFormat, 1))
    def highlightBlock(self, text):
        # Note: highlightBlock() sees one line at a time, so CSS and JavaScript
        # are colored only when the opening and closing tags share a line
        for pattern, fmt, group in self.highlightingRules:
            index = pattern.indexIn(text)
            while index >= 0:
                length = pattern.matchedLength()
                if length == 0:
                    break  # guard against an endless loop on empty matches
                start = pattern.pos(group)
                if start >= 0:
                    self.setFormat(start, len(pattern.cap(group)), fmt)
                index = pattern.indexIn(text, index + length)
# ------------------- PDF Preview Dialog -------------------
class PDFPreviewDialog(QDialog):
    def __init__(self, html_content, base_path, parent=None):
        super().__init__(parent)
        self.setWindowTitle("PDF Preview")
        self.resize(800, 1100)
        self.base_path = base_path
        self.html_content = html_content
        self.parent_editor = parent
        layout = QVBoxLayout(self)
        self.preview = QWebEngineView()
        layout.addWidget(self.preview)
        btn_save_pdf = QPushButton("💾 Save PDF")
        layout.addWidget(btn_save_pdf)
        btn_save_pdf.clicked.connect(self.save_pdf)
        # Build the base URL with QUrl so it is also valid for Windows paths
        base_url = QUrl.fromLocalFile(self.base_path + "/")
        base_tag = f'<base href="{base_url.toString()}">'
        full_html = f"""
        <html>
        <head>
        {base_tag}
        <meta charset="utf-8">
        <style>
        @page {{ size: A4; margin: 2cm; }}
        body {{ font-family: Arial, sans-serif; padding: 0; }}
        img {{ max-width: 100%; height: auto; }}
        pre, code {{ white-space: pre-wrap !important; word-wrap: break-word !important; overflow-x: auto; font-family: "Courier-PS", "Courier", "Courier New", monospace; font-size: 11pt; line-height: 1.3; background-color: #f4f4f4; border-radius: 4px; padding: 8px; border: 1px solid #ddd;}}
        table, th, td {{ border: 1px solid #777; border-collapse: collapse; padding: 5px; }}
        th {{ background-color: #ddd; }}
        </style>
        <script>
        window.MathJax = {{
            tex: {{
                inlineMath: [['$', '$'], ['\\\\(', '\\\\)']],
                displayMath: [['$$','$$'], ['\\\\[','\\\\]']]
            }}
        }};
        </script>
        <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
        </head>
        <body>{self.html_content}</body>
        </html>
        """
        self.preview.setHtml(full_html, base_url)
    def save_pdf(self):
        # Use the folder of the currently loaded HTML file as the default location
        if self.parent_editor and hasattr(self.parent_editor, 'images_folder') and self.parent_editor.images_folder:
            base_name = os.path.splitext(os.path.basename(self.parent_editor.filename))[0] if self.parent_editor.filename else "document"
            default_name = os.path.join(self.parent_editor.images_folder, f"{base_name}.pdf")
        else:
            default_name = "document.pdf"
        pdf_file, _ = QFileDialog.getSaveFileName(self, "Save PDF", default_name, "PDF Files (*.pdf)")
        if not pdf_file:
            return
        if not pdf_file.lower().endswith(".pdf"):
            pdf_file += ".pdf"
        pdf_file = os.path.abspath(pdf_file)
        def pdf_callback(pdf_bytes):
            # Called by Qt once the page has been printed to PDF
            if pdf_bytes:
                with open(pdf_file, "wb") as f:
                    f.write(pdf_bytes)
                QMessageBox.information(self, "PDF Export", f"PDF saved:\n{pdf_file}")
            else:
                QMessageBox.warning(self, "PDF Export", "The PDF was not generated or is empty.")
            self.accept()
        self.preview.page().printToPdf(pdf_callback)
# ------------------- Main HTML Editor -------------------
class HTMLEditor(QWidget):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("HTML Editor + WordPress Import + PDF + DOCX + CodeExtractor")
        self.resize(1600, 900)
        self.filename = None
        self.base_path = os.getcwd()
        self.images_folder = None # Folder where the images are stored
        self.search_results = []
        self.current_search_index = -1
        self.html_files_to_cleanup = [] # HTML files to delete when the application closes
        layout = QVBoxLayout(self)
        # Toolbar
        toolbar = QHBoxLayout()
        btn_open = QPushButton("📂 Open")
        btn_save = QPushButton("💾 Save")
        btn_save_as = QPushButton("💾 Save As")
        btn_pdf = QPushButton("📄 Save as PDF")
        btn_docx = QPushButton("📄 Save as DOCX")
        btn_code_extractor = QPushButton("🔧 CodeExtractor")
        btn_about = QPushButton("About HTML Editor")
        toolbar.addWidget(btn_open)
        toolbar.addWidget(btn_save)
        toolbar.addWidget(btn_save_as)
        toolbar.addWidget(btn_pdf)
        toolbar.addWidget(btn_docx)
        toolbar.addWidget(btn_code_extractor)
        toolbar.addWidget(btn_about)
        toolbar.addStretch()
        # --- Search bar ---
        self.search_button = QPushButton("Search")
        self.search_input = QLineEdit()
        self.search_input.setPlaceholderText("Search in HTML...")
        self.search_next = QPushButton("Next")
        self.search_prev = QPushButton("Prev")
        toolbar.addWidget(self.search_button)
        toolbar.addWidget(self.search_input)
        toolbar.addWidget(self.search_next)
        toolbar.addWidget(self.search_prev)
        self.search_button.clicked.connect(self.search_text)
        self.search_input.returnPressed.connect(self.search_text)
        self.search_next.clicked.connect(lambda: self.navigate_search(1))
        self.search_prev.clicked.connect(lambda: self.navigate_search(-1))
        # WordPress URL input + button
        self.wp_button = QPushButton("Load URL")
        self.wp_input = QLineEdit()
        self.wp_input.setPlaceholderText("Enter WordPress article URL")
        toolbar.addWidget(self.wp_button)
        toolbar.addWidget(self.wp_input)
        layout.addLayout(toolbar)
        # Splitter: editor on the left, preview on the right
        splitter = QSplitter(Qt.Horizontal)
        self.editor = QTextEdit()
        self.editor.setFont(QFont("Consolas", 12))
        self.editor.setLineWrapMode(QTextEdit.WidgetWidth)
        splitter.addWidget(self.editor)
        self.preview = QWebEngineView()
        splitter.addWidget(self.preview)
        splitter.setSizes([800, 800])
        layout.addWidget(splitter)
        # Syntax highlighting
        self.highlighter = HTMLHighlighter(self.editor.document())
        # Buttons
        btn_open.clicked.connect(self.open_file)
        btn_save.clicked.connect(self.save_file)
        btn_save_as.clicked.connect(self.save_file_as)
        btn_pdf.clicked.connect(self.save_pdf)
        btn_docx.clicked.connect(self.save_docx)
        btn_code_extractor.clicked.connect(self.run_code_extractor)
        btn_about.clicked.connect(self.show_about)
        self.wp_button.clicked.connect(self.load_wordpress_url)
        self.wp_input.returnPressed.connect(self.load_wordpress_url)
        # Live preview: a single-shot timer that is restarted on every edit
        self.editor.textChanged.connect(self.schedule_preview)
        self.preview_timer = QTimer()
        self.preview_timer.setSingleShot(True)
        self.preview_timer.timeout.connect(self.update_preview)
    def get_default_export_path(self, extension):
        """Return the default export path based on the current HTML file"""
        if self.images_folder and self.filename:
            base_name = os.path.splitext(os.path.basename(self.filename))[0]
            return os.path.join(self.images_folder, f"{base_name}.{extension}")
        elif self.filename:
            base_name = os.path.splitext(os.path.basename(self.filename))[0]
            default_dir = os.path.dirname(self.filename)
            return os.path.join(default_dir, f"{base_name}.{extension}")
        return f"document.{extension}"
    def get_current_folder(self):
        """Return the folder of the currently loaded HTML file (the image folder)"""
        if self.images_folder:
            return self.images_folder
        elif self.filename:
            return os.path.dirname(self.filename)
        return os.getcwd()
    def cleanup_html_files(self):
        """Delete the HTML files created by WordPress imports and their temporary _mathml.html copies"""
        # Only tracked files are removed; deleting every *.html in the working
        # directory would also destroy unrelated documents stored there
        for html_file in self.html_files_to_cleanup:
            mathml_file = os.path.splitext(html_file)[0] + "_mathml.html"
            for path in (html_file, mathml_file):
                if not os.path.exists(path):
                    continue
                try:
                    os.remove(path)
                    print(f"🧹 Deleted HTML file: {path}")
                except Exception as e:
                    print(f"⚠️ Cannot delete {path}: {e}")
    def closeEvent(self, event):
        """Called when the application window is closed"""
        self.cleanup_html_files()
        event.accept()
    # --- CodeExtractor function ---
    def run_code_extractor(self):
        """Run CodeExtractor on the URL from the input field and save the code in the current page's folder"""
        url = self.wp_input.text().strip()
        if not url:
            QMessageBox.warning(self, "CodeExtractor", "The WordPress URL field is empty!")
            return
        # Use the folder of the currently loaded HTML file (the image folder) as the default location
        default_dir = self.get_current_folder()
        # Open a folder picker that starts in the current page's folder
        output_dir = QFileDialog.getExistingDirectory(
            self,
            "Choose the output folder for CodeExtractor",
            default_dir,
            QFileDialog.ShowDirsOnly
        )
        if not output_dir: # The user cancelled
            return
        try:
            # Run CodeExtractor; the code is saved directly in the chosen folder
            count, message = CodeExtractor.extract_code_from_url(url, output_dir)
            if count > 0:
                QMessageBox.information(self, "CodeExtractor", message)
            else:
                QMessageBox.warning(self, "CodeExtractor", message)
        except Exception as e:
            QMessageBox.warning(
                self,
                "CodeExtractor",
                f"❌ Error while running CodeExtractor:\n\n{e}"
            )
    # --- WordPress Import ---
    def load_wordpress_url(self):
        url = self.wp_input.text().strip()
        if not url:
            return
        try:
            parsed = urlparse(url)
            # The last URL path segment becomes the local file name
            base_name = parsed.path.strip("/").split("/")[-1] or "wordpress_stranica"
            local_file = os.path.abspath(f"{base_name}.html")
            download_html(url, local_file)
            clean_html(local_file, url)
            with open(local_file, "r", encoding="utf-8") as f:
                content = f.read()
            self.editor.setPlainText(content)
            self.filename = local_file
            # The image folder has the same name as the HTML file, without the extension
            self.images_folder = os.path.splitext(local_file)[0]
            self.base_path = os.path.dirname(local_file)
            self.setWindowTitle(f"HTML Editor - {os.path.basename(local_file)}")
            self.update_preview()
            # Add the file to the cleanup list
            if local_file not in self.html_files_to_cleanup:
                self.html_files_to_cleanup.append(local_file)
        except Exception as e:
            QMessageBox.warning(self, "WordPress Import", f"Error: {e}")
    # --- File operations ---
    def open_file(self):
        # Use the folder of the currently loaded HTML file (the image folder) as the default location
        default_dir = self.get_current_folder()
        filename, _ = QFileDialog.getOpenFileName(self, "Open HTML", default_dir, "HTML Files (*.html *.htm)")
        if filename:
            with open(filename, "r", encoding="utf-8") as f:
                content = f.read()
            self.editor.setPlainText(content)
            self.filename = filename
            self.base_path = os.path.dirname(os.path.abspath(filename))
            self.setWindowTitle(f"HTML Editor - {os.path.basename(filename)}")
            self.update_preview()
    def save_file(self):
        if not self.filename:
            self.save_file_as()
            return
        with open(self.filename, "w", encoding="utf-8") as f:
            f.write(self.editor.toPlainText())
        QMessageBox.information(self, "Saved", f"✅ File saved: {self.filename}")
    def save_file_as(self):
        # Suggest the current file name and folder (the image folder)
        if self.filename:
            default_name = os.path.basename(self.filename)
            default_dir = self.get_current_folder()
        else:
            default_name = "document.html"
            default_dir = os.getcwd()
        filename, _ = QFileDialog.getSaveFileName(self, "Save HTML",
                                                  os.path.join(default_dir, default_name),
                                                  "HTML Files (*.html *.htm)")
        if filename:
            self.filename = filename
            self.save_file()
            self.setWindowTitle(f"HTML Editor - {os.path.basename(filename)}")
    # --- Search ---
    def search_text(self):
        text = self.search_input.text()
        if not text:
            return
        self.search_results = []
        # Start from the top of the document and collect every match position
        cursor = self.editor.textCursor()
        cursor.movePosition(QTextCursor.Start)
        self.editor.setTextCursor(cursor)
        while self.editor.find(text):
            cursor = self.editor.textCursor()
            self.search_results.append(cursor.selectionStart())
        if self.search_results:
            self.current_search_index = 0
            self.highlight_search_result()
        else:
            QMessageBox.information(self, "Search", f"No matches for '{text}' found.")
    def highlight_search_result(self):
        if 0 <= self.current_search_index < len(self.search_results):
            pos = self.search_results[self.current_search_index]
            cursor = self.editor.textCursor()
            cursor.setPosition(pos)
            cursor.movePosition(QTextCursor.Right, QTextCursor.KeepAnchor, len(self.search_input.text()))
            self.editor.setTextCursor(cursor)
    def navigate_search(self, step):
        if not self.search_results:
            return
        # The modulo wraps around at both ends of the result list
        self.current_search_index = (self.current_search_index + step) % len(self.search_results)
        self.highlight_search_result()
    # --- Live preview ---
    def schedule_preview(self):
        self.preview_timer.start(400)

    def update_preview(self):
        html_content = self.editor.toPlainText()
        self.preview.setHtml(html_content, QUrl.fromLocalFile(self.base_path + "/"))
    # --- MathJax + temporary HTML for Pandoc ---
    def update_mathml_html(self):
        if not self.filename:
            return None
        base_name = os.path.splitext(self.filename)[0]
        mathml_file = base_name + "_mathml.html"
        with open(mathml_file, "w", encoding="utf-8") as f:
            f.write(self.editor.toPlainText())
        return mathml_file

    async def render_mathjax_and_write(self, html_content, output_html):
        """
        Load html_content into a headless browser, wait for MathJax to render,
        then write the rendered HTML to output_html.
        """
        import re  # local import, in case the module does not already import re

        MATHJAX_SCRIPT = """
        <script type="text/javascript">
        window.MathJax = {
          tex: {
            inlineMath: [['$', '$'], ['\\\\(', '\\\\)']],
            displayMath: [['$$','$$'], ['\\\\[','\\\\]']]
          },
          options: { skipHtmlTags: ['noscript','style','textarea'] }
        };
        </script>
        <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
        """
        # Ensure the MathJax script is present in the HTML we load.
        # Match the opening <head> tag case-insensitively, with or without attributes,
        # so that documents using <HEAD> or <head lang="..."> are handled as well.
        head_tag = re.search(r"<head\b[^>]*>", html_content, re.IGNORECASE)
        if head_tag:
            pos = head_tag.end()
            html_to_load = html_content[:pos] + "\n" + MATHJAX_SCRIPT + html_content[pos:]
        else:
            html_to_load = MATHJAX_SCRIPT + html_content
        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=True)
            page = await browser.new_page()
            await page.set_content(html_to_load, wait_until="load")
            # Wait for the MathJax library itself (not just the config object) and typeset
            try:
                await page.wait_for_function(
                    "() => window.MathJax && typeof window.MathJax.typesetPromise === 'function'",
                    timeout=5000)
                await page.evaluate("() => MathJax.typesetPromise()")
            except Exception:
                # Even if waiting fails, attempt to proceed
                pass
            rendered = await page.content()
            await browser.close()
        with open(output_html, "w", encoding="utf-8") as f:
            f.write(rendered)
    # --- Save PDF ---
    def save_pdf(self):
        if not self.filename:
            QMessageBox.warning(self, "PDF Export", "Please save HTML file first!")
            return
        # Open the PDF preview dialog directly, as in the previous version
        html_content = self.editor.toPlainText()
        dlg = PDFPreviewDialog(html_content, self.base_path, self)
        dlg.exec_()
    # --- Save DOCX (styles integrated) ---
    def save_docx(self):
        if not self.filename:
            QMessageBox.warning(self, "DOCX Export", "Please save HTML file first!")
            return
        # Suggest the path of the current HTML file (the folder with the images) for the DOCX
        default_path = self.get_default_export_path("docx")
        output_docx, _ = QFileDialog.getSaveFileName(self, "Save DOCX", default_path, "DOCX Files (*.docx)")
        if not output_docx:
            return
        mathml_file = self.update_mathml_html()
        if not mathml_file:
            QMessageBox.warning(self, "DOCX Export", "Cannot create _mathml.html")
            return
        html_content = self.editor.toPlainText()
        try:
            # Render MathJax in a headless browser, then let Pandoc convert the result
            asyncio.run(self.render_mathjax_and_write(html_content, mathml_file))
            subprocess.run(["pandoc", mathml_file, "-s", "-o", output_docx, "--mathml"], check=True)
            # --- Styles ---
            from docx import Document
            from docx.shared import Pt, RGBColor
            from docx.enum.style import WD_STYLE_TYPE
            doc = Document(output_docx)
            styles = doc.styles
            # Paragraph style name -> (font name, size in points)
            paragraph_styles = {
                "Title": ("Cambria Math", 20),
                "Heading 1": ("Cambria Math", 16),
                "Heading 2": ("Cambria Math", 14),
                "Heading 3": ("Cambria Math", 14),
                "Heading 4": ("Cambria Math", 14),
                "Normal": ("Cambria Math", 11),
                "Source Code": ("Courier New", 10),
                "First Paragraph": ("Cambria Math", 11),
                "Body Text": ("Cambria Math", 11),
                "Compact": ("Cambria Math", 11)
            }
            for name, (font, size) in paragraph_styles.items():
                for s in styles:
                    if s.name == name and s.type == WD_STYLE_TYPE.PARAGRAPH:
                        s.font.name = font
                        s.font.size = Pt(size)
                        s.font.color.rgb = RGBColor(0,0,0)
            # Apply the code font directly to the runs of every "Source Code" paragraph
            for p in doc.paragraphs:
                if p.style.name == "Source Code":
                    for r in p.runs:
                        r.font.name = "Courier New"
                        r.font.size = Pt(10)
                        r.font.color.rgb = RGBColor(0,0,0)
            # Character styles that Pandoc generates for syntax-highlighted tokens
            character_styles = [
                "KeywordTok","DataTypeTok","DecValTok","BaseNTok","FloatTok","ConstantTok",
                "CharTok","SpecialCharTok","StringTok","VerbatimStringTok","SpecialStringTok",
                "ImportTok","CommentTok","DocumentationTok","AnnotationTok","CommentVarTok",
                "OtherTok","FunctionTok","VariableTok","ControlFlowTok","OperatorTok","BuiltInTok",
                "ExtensionTok","PreprocessorTok","AttributeTok","RegionMarkerTok","InformationTok",
                "WarningTok","AlertTok","ErrorTok","NormalTok"
            ]
            for name in character_styles:
                for s in styles:
                    if s.name == name and s.type == WD_STYLE_TYPE.CHARACTER:
                        s.font.name = "Times New Roman"
                        s.font.size = Pt(10)
                        s.font.color.rgb = RGBColor(0,0,0)
            doc.save(output_docx)
            QMessageBox.information(self, "DOCX Export", f"✅ DOCX file saved with styles:\n{output_docx}")
        except Exception as e:
            QMessageBox.warning(self, "DOCX Export", f"❌ Error during conversion or while applying styles:\n{e}")
    # --- About ---
    def show_about(self):
        about_text = (
            "<h3>🧩 About HTML Editor</h3>"
            "<p><b>HTML Editor</b> is a versatile desktop application designed for editing, previewing, and exporting HTML documents.</p>"
            "<p>It provides a clean dual-view interface with a real-time preview and supports advanced content such as:</p>"
            "<ul>"
            "<li>✅ MathJax / LaTeX mathematical formulas</li>"
            "<li>✅ WordPress articles with text, images, and code blocks</li>"
            "<li>✅ EnlighterJS code highlighting</li>"
            "<li>✅ Export to <b>PDF</b> and <b>DOCX</b> formats</li>"
            "<li>✅ <b>CodeExtractor</b> tool for extracting code from WordPress articles</li>"
            "</ul>"
            "<p>The application is optimized for scientific, technical, and academic publishing — ensuring perfect preservation of mathematical expressions and code formatting during conversion and printing.</p>"
            "<p><b>Author:</b> Aleksandar Maričić<br>"
            "<b>Version:</b> 1.0<br>"
            "<b>License:</b> Free for personal and academic use<br>"
            "<b>Platform:</b> Python + PyQt5</p>"
        )
        QMessageBox.information(self, "About HTML Editor", about_text)
# ------------------- Main -------------------
if __name__ == "__main__":
    app = QApplication(sys.argv)
    editor = HTMLEditor()
    editor.show()
    # Accept a URL from the command line
    if len(sys.argv) > 1:
        url_arg = sys.argv[1]
        # Short delay so the GUI finishes starting up before loading begins
        QTimer.singleShot(800, lambda: (editor.wp_input.setText(url_arg), editor.load_wordpress_url()))
    # Run the cleanup when the application closes
    app.aboutToQuit.connect(editor.cleanup_html_files)
    sys.exit(app.exec_())
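
The MathJax injection step in `render_mathjax_and_write` can be tried in isolation, without launching a browser. The sketch below is illustrative and is not part of htmleditor.py: the helper name `inject_into_head` is hypothetical, and it assumes the snippet should go right after the opening `<head>` tag (matched case-insensitively, attributes allowed), or be prepended to the document when no such tag exists.

```python
import re

# Hypothetical helper, not part of htmleditor.py.
# Matches an opening <head> tag in any letter case, with optional attributes.
# The \b keeps tags such as <header> from matching.
HEAD_TAG = re.compile(r"<head\b[^>]*>", re.IGNORECASE)


def inject_into_head(html: str, snippet: str) -> str:
    """Return html with snippet placed right after the first <head> tag.

    If the document has no <head> tag, the snippet is prepended instead.
    """
    match = HEAD_TAG.search(html)
    if match is None:
        return snippet + html
    end = match.end()
    # Slicing avoids re.sub, which would interpret backslashes in the snippet.
    return html[:end] + "\n" + snippet + html[end:]


if __name__ == "__main__":
    page = "<HTML><HEAD><title>Demo</title></HEAD><body>$x^2$</body></HTML>"
    print(inject_into_head(page, "<script>/* MathJax config */</script>"))
```

Slicing around the match, rather than using a regex substitution, keeps backslashes in the injected script (such as the escaped LaTeX delimiters) from being read as replacement escapes.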
#MIT_License.txt
MIT License

Copyright (c) [2025] [Aleksandar Maričić]

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
The complete program, with installation scripts and instructions, can be downloaded from this link:
