🎧 A Python program for boosting and processing audio in MP4 video files

When a video recording turns out too quiet and re-recording is not an option, the quickest fix is to boost the audio in the existing MP4 file.
This Python program does exactly that: it amplifies the audio track inside a video file automatically, with options for additional processing, filtering, and adjustment of the sound quality.


🔍 Main features of the program

The program is written in Python 3 and uses current libraries for audio processing and for its graphical interface.
It combines PyQt6 for the GUI, FFmpeg for converting and merging video/audio streams, and pydub, pyaudio, and scipy for analyzing and improving the sound.

The main features include:

  • 🎚️ Boosting the audio in MP4 video files without re-encoding the video stream
  • 🎧 Playing back and previewing the audio before and after processing
  • 🧠 Digital sound processing with filters from scipy.signal
  • 🗂️ Simple loading and saving of files through the graphical interface
  • ⚙️ Automatic integration with FFmpeg for converting and synchronizing the audio/video streams

🖥️ Installation

For the program to work correctly, you need Python 3 and a few additional packages.
On Linux systems (Ubuntu, Debian, Mint), a single line in the terminal is enough:

sudo apt update && sudo apt install -y python3 python3-pip ffmpeg alsa-utils portaudio19-dev libasound2-dev libavcodec-extra && python3 -m pip install --upgrade pip && pip install numpy scipy PyQt6 pyaudio pydub

This installs all the required dependencies:

  • ffmpeg for audio/video processing
  • PyQt6 for the GUI
  • numpy, scipy for digital signal processing
  • pyaudio, pydub for working with audio data
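
A quick way to confirm that the installation worked is to check for each dependency before starting the GUI. The sketch below is not part of the program; the function name `missing_dependencies` is made up for illustration, and it assumes the packages import under the names used in the listing (PyQt6, numpy, scipy, pyaudio, pydub).

```python
import importlib.util
import shutil

# Import names of the Python packages listed above (assumed, not verified here).
REQUIRED_MODULES = ["PyQt6", "numpy", "scipy", "pyaudio", "pydub"]


def missing_dependencies(modules=REQUIRED_MODULES, executables=("ffmpeg",)):
    """Return (missing_modules, missing_executables) without importing anything."""
    # find_spec() returns None when a top-level module cannot be located.
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    # shutil.which() returns None when the executable is not on PATH.
    missing_executables = [e for e in executables if shutil.which(e) is None]
    return missing_modules, missing_executables


if __name__ == "__main__":
    mods, exes = missing_dependencies()
    print("Missing Python packages:", mods or "none")
    print("Missing executables:", exes or "none")
```

If both lists come back empty, the program should at least be able to import everything it needs and find FFmpeg.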

🧩 How to use it

  1. Start the program with the command: python3 xzvuk.py
  2. Click “Izaberi MP4 fajl” (choose MP4 file) and select the MP4 file you want, then use “Izaberi izlazni MP4” (choose output MP4) to set where the result is saved.
  3. Set the gain level with the “Global gain (dB)” slider, and adjust the per-band sliders if needed.
  4. Press “Primeni EQ na MP4” (apply EQ to MP4); the program processes the audio and saves a new video file.
  5. Optionally, listen to the result inside the interface with “Preview WAV”.

All operations run locally, with no need for an internet connection.
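
Behind those buttons, the listing further down calls FFmpeg twice: once to extract the audio as 16-bit PCM WAV, and once to copy the video stream while taking the audio from the processed WAV. The sketch below only builds the two argument lists so they can be inspected or passed to `subprocess.run`. The helper names and the file names are hypothetical; the flags are the ones used in the listing.

```python
def extract_audio_cmd(video_path, wav_path):
    """ffmpeg arguments that drop the video (-vn) and write 16-bit PCM WAV."""
    return ["ffmpeg", "-y", "-i", video_path, "-vn", "-acodec", "pcm_s16le", wav_path]


def replace_audio_cmd(video_path, wav_path, output_path):
    """ffmpeg arguments that copy the video stream and take audio from the WAV."""
    return [
        "ffmpeg", "-y", "-i", video_path, "-i", wav_path,
        "-c:v", "copy",    # video is copied, not re-encoded
        "-map", "0:v:0",   # first video stream of the first input
        "-map", "1:a:0",   # first audio stream of the second input
        output_path,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(extract_audio_cmd("input.mp4", "audio.wav")))
    print(" ".join(replace_audio_cmd("input.mp4", "processed.wav", "output.mp4")))
```

Keeping the commands as lists rather than one shell string avoids quoting problems with file names that contain spaces.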


⚡ Advantages over other tools

Unlike many online services that require an upload and recompress the file, this program runs entirely locally and copies the original video stream without re-encoding it.
It applies signal-processing math (IIR peaking filters from scipy.signal and gain expressed in decibels) to keep the boost as clean as possible. Samples are clipped to the 16-bit range, so very high gain settings can still distort.
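
The gain arithmetic itself is short. A value in decibels converts to a linear factor of 10^(dB/20), the normalized samples are multiplied by it, and the result is clipped to the 16-bit range. Below is a minimal sketch in plain Python, without numpy; the function names are made up for illustration, and the program performs the same steps on numpy arrays.

```python
def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db):
    """Scale floats in -1..1 by gain_db and return clipped 16-bit integers."""
    factor = db_to_linear(gain_db)
    out = []
    for s in samples:
        scaled = s * factor * 32767.0
        # Anything outside the int16 range is clipped, which is where
        # audible distortion starts at high gain settings.
        out.append(int(max(-32768.0, min(32767.0, scaled))))
    return out


if __name__ == "__main__":
    print(db_to_linear(6))                  # roughly a doubling of amplitude
    print(apply_gain([0.1, 0.5, -1.0], 12))
```

At +12 dB the factor is close to 4, so any sample above about a quarter of full scale already hits the clipping limit.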


🧠 Technologies behind the scenes

The program relies on the following libraries and technologies:

Library        Purpose
PyQt6          Graphical interface (GUI)
pydub          Audio segmentation and format handling
pyaudio        Audio playback and recording
scipy.signal   Signal filtering and processing
numpy          Mathematical operations
FFmpeg         Merging and converting audio/video streams

🧾 Conclusion

This program is a simple but capable solution for boosting and processing audio in MP4 video files.
It gives users control over the quality and level of the sound without requiring complex editing tools.
It is well suited to podcasts, tutorials, videos with weak audio, and any other situation where the voice matters more than the picture.


Autor: Abel Software
License: Free for personal and educational use
Technology: Python 3, PyQt6, FFmpeg
Platforms: Linux, Windows, macOS


Source code for xzvuk.py

# xzvuk.py
# For the program to work correctly, you need Python 3 and a few additional packages.
# On Linux systems (Ubuntu, Debian, Mint), a single line in the terminal is enough:
# sudo apt update && sudo apt install -y python3 python3-pip ffmpeg alsa-utils portaudio19-dev libasound2-dev libavcodec-extra && python3 -m pip install --upgrade pip && pip install numpy scipy PyQt6 pyaudio pydub

import sys
import numpy as np
from scipy.signal import iirpeak, lfilter
from PyQt6.QtWidgets import (
    QApplication, QWidget, QLabel, QPushButton, QVBoxLayout, QHBoxLayout,
    QSlider, QFileDialog, QLineEdit, QMessageBox, QProgressBar
)
from PyQt6.QtCore import Qt, QTimer
from pydub import AudioSegment
import threading
import pyaudio
import subprocess
import tempfile
import os
import traceback

class EQPlayer(QWidget):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("Audio EQ Player (MP4 -> WAV Auto)")
        self.resize(900, 420)

        # EQ band center frequencies (Hz)
        self.freqs = [60, 150, 250, 500, 1000, 2000, 4000, 6000, 8000, 12000]
        self.sliders = []
        self.vu_meters = []
        self.global_slider = None

        self.audio_segment = None
        self.audio_thread = None
        self.stream = None
        self.p = None
        self.stop_audio = threading.Event()
        # current_chunk normalized in -1..1 floats (used for VU)
        self.current_chunk = np.zeros(512, dtype=np.float32)

        self.timer = QTimer()
        self.timer.setInterval(50)
        self.timer.timeout.connect(self.update_vu_meters)

        self.wav_temp = None

        self.setup_ui()

    def setup_ui(self):
        main_layout = QVBoxLayout()

        # EQ sliders + VU meters
        eq_layout = QHBoxLayout()
        for f in self.freqs:
            vbox = QVBoxLayout()
            vu = QProgressBar()
            vu.setOrientation(Qt.Orientation.Vertical)
            vu.setMinimum(0)
            vu.setMaximum(100)
            vu.setValue(0)
            self.vu_meters.append(vu)
            vbox.addWidget(vu)

            slider = QSlider(Qt.Orientation.Vertical)
            slider.setMinimum(-12)
            slider.setMaximum(12)
            slider.setValue(0)
            self.sliders.append(slider)
            vbox.addWidget(QLabel(f"{f} Hz"))
            vbox.addWidget(slider)
            eq_layout.addLayout(vbox)

        # Global gain
        gain_layout = QVBoxLayout()
        gain_layout.addWidget(QLabel("Global gain (dB)"))
        self.global_slider = QSlider(Qt.Orientation.Vertical)
        self.global_slider.setMinimum(-12)
        self.global_slider.setMaximum(12)
        self.global_slider.setValue(0)
        gain_layout.addWidget(self.global_slider)
        eq_layout.addLayout(gain_layout)

        main_layout.addLayout(eq_layout)

        # Input MP4 file and output MP4
        file_layout = QVBoxLayout()
        self.mp4_edit = QLineEdit()
        btn_mp4 = QPushButton("Izaberi MP4 fajl")
        btn_mp4.clicked.connect(self.select_mp4)
        file_layout.addWidget(self.mp4_edit)
        file_layout.addWidget(btn_mp4)

        self.output_edit = QLineEdit()
        btn_out = QPushButton("Izaberi izlazni MP4")
        btn_out.clicked.connect(lambda: self.izaberi_fajl(self.output_edit, save=True))
        file_layout.addWidget(self.output_edit)
        file_layout.addWidget(btn_out)

        main_layout.addLayout(file_layout)

        # Buttons
        btn_layout = QHBoxLayout()
        btn_preview = QPushButton("Preview WAV")
        btn_preview.clicked.connect(self.preview)
        btn_save = QPushButton("Primeni EQ na MP4")
        btn_save.clicked.connect(self.apply_to_mp4)
        btn_stop = QPushButton("Stop")
        btn_stop.clicked.connect(self.stop)
        btn_layout.addWidget(btn_preview)
        btn_layout.addWidget(btn_save)
        btn_layout.addWidget(btn_stop)

        main_layout.addLayout(btn_layout)
        self.setLayout(main_layout)

    def select_mp4(self):
        fajl, _ = QFileDialog.getOpenFileName(self, "Izaberi MP4 fajl", "", "MP4 fajlovi (*.mp4)")
        if not fajl:
            return
        self.mp4_edit.setText(fajl)
        # automatically extract the audio track to a temporary WAV file
        tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
        tmp.close()
        self.wav_temp = tmp.name
        try:
            # stdout/stderr are not suppressed, so ffmpeg errors stay visible in the console
            subprocess.run([
                "ffmpeg", "-y", "-i", fajl, "-vn", "-acodec", "pcm_s16le", self.wav_temp
            ], check=True)
            self.audio_segment = AudioSegment.from_wav(self.wav_temp)
        except Exception as e:
            QMessageBox.warning(self, "Greška", f"Ne mogu da ekstraktujem audio:\n{e}")
            self.audio_segment = None
            # remove the temporary file if extraction failed
            try:
                if os.path.exists(self.wav_temp):
                    os.remove(self.wav_temp)
            except OSError:
                pass
            self.wav_temp = None

    def izaberi_fajl(self, line_edit, save=False):
        if save:
            fajl, _ = QFileDialog.getSaveFileName(self, "Izaberi izlazni MP4", "", "MP4 fajlovi (*.mp4)")
        else:
            fajl, _ = QFileDialog.getOpenFileName(self, "Izaberi fajl", "", "MP4 fajlovi (*.mp4)")
        if fajl:
            line_edit.setText(fajl)

    def apply_eq(self, samples, rate):
        """
        samples: numpy float32 in the range -1.0..1.0
        returns: numpy float32 (may exceed -1.0..1.0 when bands are boosted; callers clip)
        """
        out = samples.copy()
        for slider, f in zip(self.sliders, self.freqs):
            g_db = slider.value()
            if g_db == 0:
                continue  # 0 dB leaves this band unchanged
            # normalize the frequency for iirpeak: 0..1 (1 = Nyquist)
            nyq = rate / 2.0
            w0 = float(f) / nyq
            if not (0 < w0 < 1.0):
                continue
            try:
                # low Q gives a wide, stable peaking band
                b, a = iirpeak(w0, Q=1.0)
                band = lfilter(b, a, out)
                gain = 10 ** (g_db / 20.0)
                out += band * (gain - 1.0)  # add the boosted (or cut) band component
            except Exception:
                # if a filter fails, skip it (it must not break playback)
                continue
        return out

    def preview(self):
        if not self.audio_segment:
            QMessageBox.warning(self, "Greška", "Izaberi MP4 fajl")
            return
        self.stop()
        self.stop_audio.clear()
        self.audio_thread = threading.Thread(target=self.audio_loop, daemon=True)
        self.audio_thread.start()
        self.timer.start()

    def audio_loop(self):
        """
        Playback in a separate thread. The samples are normalized to -1..1, processed,
        and converted to int16 before being written to the stream.
        """
        try:
            self.p = pyaudio.PyAudio()
            # prepare the samples from the audio segment
            sample_width = self.audio_segment.sample_width  # bytes per sample (1, 2, ...)
            frame_rate = self.audio_segment.frame_rate
            channels = self.audio_segment.channels

            # fetch the raw samples as a numpy array (integers)
            raw = np.array(self.audio_segment.get_array_of_samples())

            # if the audio is multi-channel, average the channels down to mono
            if channels > 1:
                try:
                    raw = raw.reshape((-1, channels)).mean(axis=1)
                except Exception:
                    # if reshape fails (sample count not divisible by channel count), trim the tail
                    n_frames = (len(raw) // channels) * channels
                    raw = raw[:n_frames].reshape((-1, channels)).mean(axis=1)

            # normalize to [-1, 1] according to sample_width
            max_val = float(2 ** (8 * sample_width - 1))
            samples = raw.astype(np.float32) / max_val

            # the output below is always converted to int16, so the stream is
            # opened as 16-bit regardless of the source sample width
            pa_format = pyaudio.paInt16

            # open the stream (mono)
            try:
                self.stream = self.p.open(format=pa_format,
                                          channels=1,
                                          rate=frame_rate,
                                          output=True,
                                          frames_per_buffer=4096)
            except Exception as e:
                # this runs in a worker thread, where Qt widgets such as QMessageBox
                # must not be used, so the error is reported on the console instead
                print("Ne mogu da otvorim audio stream:", e)
                return

            chunk_size = 4096
            pos = 0
            total = len(samples)
            while pos < total and not self.stop_audio.is_set():
                chunk = samples[pos:pos+chunk_size]
                if chunk.size == 0:
                    break
                # process in the float range -1..1
                chunk_eq = self.apply_eq(chunk, frame_rate)
                # global gain
                gain = 10 ** (self.global_slider.value() / 20.0)
                chunk_eq = chunk_eq * gain
                # keep the first 512 samples for the VU meters,
                # zero-padding if the chunk is shorter
                csize = min(512, chunk_eq.size)
                buf = np.zeros(512, dtype=np.float32)
                buf[:csize] = chunk_eq[:csize]
                self.current_chunk = buf

                # convert to int16 before writing
                out_int16 = np.int16(np.clip(chunk_eq * 32767.0, -32768, 32767))
                try:
                    self.stream.write(out_int16.tobytes())
                except Exception as e:
                    # print the error to the console and stop playback
                    print("Audio stream write error:", e)
                    traceback.print_exc()
                    break

                pos += chunk_size

        finally:
            # always try to close the stream and PyAudio cleanly
            try:
                if self.stream is not None:
                    try:
                        self.stream.stop_stream()
                    except Exception:
                        pass
                    try:
                        self.stream.close()
                    except Exception:
                        pass
                    self.stream = None
            except Exception:
                pass
            try:
                if self.p is not None:
                    try:
                        self.p.terminate()
                    except Exception:
                        pass
                    self.p = None
            except Exception:
                pass

    def stop(self):
        self.stop_audio.set()
        self.timer.stop()

    def update_vu_meters(self):
        if self.audio_segment is None:
            return
        # run an FFT over self.current_chunk (floats in -1..1)
        try:
            fft_vals = np.abs(np.fft.rfft(self.current_chunk, n=512))
            freqs_fft = np.fft.rfftfreq(512, 1/self.audio_segment.frame_rate)
            for i, f in enumerate(self.freqs):
                idx = np.argmin(np.abs(freqs_fft - f))
                val = 20 * np.log10(fft_vals[idx] + 1e-9)  # dB
                # map the expected range -60..0 dB to 0..100
                val_norm = int(np.clip((val + 60)/60*100, 0, 100))
                self.vu_meters[i].setValue(val_norm)
        except Exception:
            # must not crash the GUI
            pass

    def apply_to_mp4(self):
        original_mp4 = self.mp4_edit.text()
        output_mp4 = self.output_edit.text()
        if not original_mp4 or not output_mp4 or not self.wav_temp:
            QMessageBox.warning(self, "Greška", "Morate izabrati sve fajlove")
            return

        # prepare the normalized samples (multi-channel audio is averaged to mono)
        sample_width = self.audio_segment.sample_width
        channels = self.audio_segment.channels
        raw = np.array(self.audio_segment.get_array_of_samples())
        if channels > 1:
            raw = raw.reshape((-1, channels)).mean(axis=1)
        max_val = float(2 ** (8 * sample_width - 1))
        samples = raw.astype(np.float32) / max_val

        # apply the EQ and the global gain
        processed = self.apply_eq(samples, self.audio_segment.frame_rate)
        processed *= 10 ** (self.global_slider.value()/20.0)

        # convert to int16 for the WAV file
        out_int16 = np.int16(np.clip(processed * 32767.0, -32768, 32767))

        tmp_out = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
        tmp_out.close()
        try:
            out_segment = AudioSegment(
                out_int16.tobytes(),
                frame_rate=self.audio_segment.frame_rate,
                sample_width=2,   # export as 16-bit
                channels=1
            )
            out_segment.export(tmp_out.name, format="wav")

            # replace the audio in the original MP4
            subprocess.run([
                "ffmpeg", "-y", "-i", original_mp4, "-i", tmp_out.name,
                "-c:v", "copy", "-map", "0:v:0", "-map", "1:a:0",
                output_mp4
            ], check=True)
            QMessageBox.information(self, "Uspeh", f"MP4 sa poboljšanim zvukom sačuvan:\n{output_mp4}")
        except subprocess.CalledProcessError as e:
            QMessageBox.warning(self, "FFmpeg greška", f"FFmpeg je vratio grešku:\n{e}")
        except Exception as e:
            QMessageBox.warning(self, "Greška", f"Nešto je pošlo po zlu:\n{e}")
        finally:
            try:
                if os.path.exists(tmp_out.name):
                    os.remove(tmp_out.name)
            except OSError:
                pass

if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = EQPlayer()
    window.show()
    sys.exit(app.exec())
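
The dB-to-percentage mapping used in update_vu_meters above can be isolated and tried on its own. The sketch below repeats it in plain Python with the `math` module instead of numpy; the function name `level_to_percent` and the `floor_db` parameter are made up for illustration.

```python
import math


def level_to_percent(magnitude, floor_db=-60.0):
    """Map a spectral magnitude to a 0..100 meter value over floor_db..0 dB."""
    # The small offset avoids log10(0) for silent input, as in the listing.
    db = 20.0 * math.log10(magnitude + 1e-9)
    percent = (db - floor_db) / (0.0 - floor_db) * 100.0
    # Clamp to the range of the progress bar and truncate to an integer.
    return int(max(0.0, min(100.0, percent)))


if __name__ == "__main__":
    for magnitude in (0.0, 0.001, 0.03, 0.3, 1.0, 10.0):
        print(magnitude, "->", level_to_percent(magnitude))
```

One thing to keep in mind: the magnitudes from a 512-point FFT are not divided by the window length in the listing, so a loud tone can produce values far above 1 and pin the bar at 100. Dividing the FFT output by the number of samples before converting to dB would be one way to address that, though this is an assumption about the intended meter behavior.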

#MIT_License.txt
MIT License

Copyright (c) [2025] [Aleksandar Maričić]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE. 

By Abel
