[proofplan]
We use the standard dense coefficient algorithms: coefficientwise addition, schoolbook convolution for multiplication, and long division for quotient and remainder. The only point requiring care is runtime: the number of coefficient operations is polynomial in the dense input lengths, and the stated straight-line size bound ensures that all intermediate field encodings remain polynomially long. After each computation we trim trailing zero coefficients using the polynomial-time zero test, producing the canonical dense representation. For division, the leading coefficient of $g$ is nonzero by canonicality, so each division step is legitimate, and the usual degree descent gives existence and uniqueness of the quotient and remainder.
[/proofplan]
[step:Introduce input parameters and the canonicalization procedure]
Let the canonical dense inputs be
\begin{align*}
f=\sum_{i=0}^{a} u_i x^i,\qquad g=\sum_{j=0}^{b} v_j x^j,
\end{align*}
where $a,b \ge 0$, and where $u_a \ne 0$ if $f \ne 0$ and $v_b \ne 0$ if $g \ne 0$. Let $N$ denote the total bit-length of the two input dense encodings, including the lengths of all encoded coefficients and the list-length data. Since each coefficient contributes at least one symbol to a dense list, the integers $a+1$ and $b+1$ are at most $N$.
Define the canonicalization procedure on a finite coefficient list $(c_0,\dots,c_e)$ as follows. Starting from $i=e$ and descending, use the zero-test algorithm in $k$ to find the largest index $i$ with $c_i \ne 0$, if such an index exists. If no such index exists, output the one-term list $(0)$; otherwise output $(c_0,\dots,c_i)$. This procedure makes at most $e+1$ zero tests and therefore runs in polynomial time whenever $e$ and all coefficient encoding lengths are polynomially bounded in $N$.
[/step]
[step:Compute sums by coefficientwise field addition and trim trailing zeros]
For addition, set $e=\max\{a,b\}$. Define coefficients $w_i \in k$ for $0 \le i \le e$ by
\begin{align*}
w_i=\tilde u_i+\tilde v_i,
\end{align*}
where $\tilde u_i=u_i$ for $0 \le i \le a$ and $\tilde u_i=0$ for $a<i\le e$, while $\tilde v_i=v_i$ for $0 \le i \le b$ and $\tilde v_i=0$ for $b<i\le e$. Then
\begin{align*}
f+g=\sum_{i=0}^{e} w_i x^i.
\end{align*}
The algorithm performs at most $e+1 \le N$ additions in $k$. Every coefficient $w_i$ is obtained from the input coefficients by a straight-line computation of length at most $1$, so its encoding length is at most $P(N+1)$. Applying the canonicalization procedure from the previous step therefore takes polynomial time in $N$ and returns the canonical dense encoding of $f+g$.
[/step]
[step:Compute products by the schoolbook convolution formula]
For multiplication, define coefficients $z_\ell \in k$ for $0 \le \ell \le a+b$ by
\begin{align*}
z_\ell=\sum_{i=0}^{\ell} \alpha_{i,\ell-i},
\end{align*}
where $\alpha_{i,\ell-i}=u_i v_{\ell-i}$ if $0 \le i \le a$ and $0 \le \ell-i \le b$, and $\alpha_{i,\ell-i}=0$ otherwise. Then distributivity in the [polynomial ring](/page/Polynomial%20Ring) gives
\begin{align*}
fg=\sum_{\ell=0}^{a+b} z_\ell x^\ell.
\end{align*}
The algorithm computes each $z_\ell$ by multiplying all admissible pairs $u_i v_{\ell-i}$ and summing them. The number of field multiplications is at most $(a+1)(b+1)$, and the number of field additions is at most $(a+b+1)\max\{a+1,b+1\}$. Since $a+1$ and $b+1$ are at most $N$, the total number of field operations is bounded by a polynomial in $N$.
Each coefficient $z_\ell$ is obtained from the input coefficients by a straight-line computation of length bounded by a polynomial in $N$. Hence the size-bound hypothesis gives a polynomial bound in $N$ for every intermediate and output coefficient encoding. Because the field operation algorithms run in time polynomial in the lengths of their input encodings, the whole convolution computation runs in polynomial time in $N$. Finally, canonicalizing the list $(z_0,\dots,z_{a+b})$ uses at most $a+b+1 \le 2N$ zero tests on polynomially bounded encodings, so it also runs in polynomial time. The resulting list is the canonical dense encoding of $fg$.
[guided]
The dense multiplication algorithm is the usual coefficient convolution. The coefficient of $x^\ell$ in the product comes from all pairs of terms $u_i x^i$ and $v_j x^j$ with $i+j=\ell$. Thus, for each $0 \le \ell \le a+b$, we define $z_\ell \in k$ by summing exactly those products $u_i v_j$ whose indices satisfy $i+j=\ell$. Written with one index, this is
\begin{align*}
z_\ell=\sum_{i=0}^{\ell} \alpha_{i,\ell-i},
\end{align*}
where $\alpha_{i,\ell-i}=u_i v_{\ell-i}$ when the indices are in range, and $\alpha_{i,\ell-i}=0$ otherwise. Then expanding $f$ and $g$ and collecting equal powers of $x$ gives
\begin{align*}
fg=\sum_{\ell=0}^{a+b} z_\ell x^\ell.
\end{align*}
Now we account for runtime. There are at most $(a+1)(b+1)$ products of input coefficients, and this is at most $N^2$ because the dense input lists have lengths $a+1$ and $b+1$, both bounded by the total input length $N$. The additions used to collect those products are also polynomially many. This proves that the number of field operations is polynomial in $N$.
The remaining issue is coefficient growth. Polynomial-time field operations alone only say that one operation is fast relative to the lengths of its inputs; they do not by themselves prevent later inputs from becoming exponentially long. This is exactly why the theorem assumes the straight-line size bound. Each coefficient $z_\ell$ is produced by a straight-line computation from the original input coefficients using polynomially many additions and multiplications, so its encoding length is bounded by $P(N+m)$ for some polynomially bounded operation count $m$. Therefore every field operation is applied to polynomial-length encodings, and each such operation takes polynomial time in $N$.
Finally, the raw convolution list may have trailing zero coefficients because cancellation can occur in $k$. We therefore apply the canonicalization procedure: starting from the last coefficient, use the zero test to find the last nonzero coefficient, or return $(0)$ if all coefficients vanish. There are at most $a+b+1 \le 2N$ coefficients to test, and each has polynomially bounded length, so trimming is polynomial-time. The output is consequently the canonical dense encoding of $fg$.
[/guided]
[/step]
[step:Run long division using the nonzero leading coefficient of the divisor]
Assume now that $g \ne 0$. Since the input for $g$ is canonical, its leading coefficient $v_b$ satisfies $v_b \ne 0$, so division by $v_b$ is a permitted field operation.
Initialize a mutable remainder polynomial $R_0 \in k[x]$ by $R_0=f$ and initialize a quotient polynomial $Q_0 \in k[x]$ by $Q_0=0$. At the start of each loop iteration, store $R_t$ as a dense list of length at most $a+1$ and determine whether $R_t=0$ and, if not, determine $\deg R_t$ by scanning this list downward with the zero-test algorithm from the canonicalization procedure. For $t=0,1,\dots$, while $R_t \ne 0$ and $\deg R_t \ge b$, let $d_t=\deg R_t$, let $\rho_t \in k$ be the coefficient of $x^{d_t}$ in $R_t$, and define
\begin{align*}
c_t=\rho_t/v_b.
\end{align*}
Set
\begin{align*}
Q_{t+1}=Q_t+c_t x^{d_t-b},
\end{align*}
and
\begin{align*}
R_{t+1}=R_t-c_t x^{d_t-b}g.
\end{align*}
The coefficient of $x^{d_t}$ in $R_{t+1}$ is $\rho_t-c_t v_b=0$, so either $R_{t+1}=0$ or $\deg R_{t+1}<d_t$. Thus the degree of the mutable remainder strictly decreases at every iteration. Since initially $\deg R_0 \le a$, the loop has at most $a+1 \le N$ iterations.
At termination, define $q=Q_T$ and $r=R_T$, where $T$ is the number of loop iterations. By summing the update identity
\begin{align*}
R_{t+1}=R_t-(Q_{t+1}-Q_t)g
\end{align*}
over all iterations, we obtain
\begin{align*}
f=R_0=Q_Tg+R_T=qg+r.
\end{align*}
The stopping condition gives $r=0$ or $\deg r<b=\deg g$.
The algorithm stores $R_t$ densely with length at most $a+1$ and stores $Q_t$ densely with length at most $\max\{1,a-b+1\}$. In one loop iteration it computes one field division, scans at most $a+1 \le N$ coefficients to determine the current degree, and then updates at most $b+1 \le N$ coefficients of the remainder, so the total number of field operations and zero tests over all iterations is bounded by a polynomial in $N$. Every coefficient appearing in any $Q_t$ or $R_t$ is obtained from the input coefficients by a straight-line computation of polynomial length using the allowed field operations and divisions by the already verified nonzero element $v_b$. Hence all intermediate encodings actually produced by the operation algorithms have polynomial length by the size-bound hypothesis, and the field operations and zero tests performed by the loop take polynomial time in $N$.
Canonicalizing the final dense coefficient lists of $q$ and $r$ uses polynomially many zero tests on polynomially bounded encodings. Therefore the [division algorithm](/theorems/725) runs in polynomial time and returns canonical dense encodings of polynomials $q,r$ satisfying the required division identity and degree condition.
[/step]
[step:Prove uniqueness of the quotient and remainder]
Suppose that $q_1,r_1,q_2,r_2 \in k[x]$ satisfy
\begin{align*}
f=q_1g+r_1=q_2g+r_2,
\end{align*}
with $r_1=0$ or $\deg r_1<\deg g$, and $r_2=0$ or $\deg r_2<\deg g$. Subtracting the two identities gives
\begin{align*}
(q_1-q_2)g=r_2-r_1.
\end{align*}
If $q_1-q_2 \ne 0$, then, because $k$ is a field and $g \ne 0$, the degree of the left-hand side is
\begin{align*}
\deg((q_1-q_2)g)=\deg(q_1-q_2)+\deg g \ge \deg g.
\end{align*}
The right-hand side has degree strictly less than $\deg g$ unless it is zero, because it is the difference of two polynomials each either zero or of degree less than $\deg g$. This is impossible. Hence $q_1-q_2=0$, so $q_1=q_2$, and then the displayed identity gives $r_1=r_2$.
Combining existence from the long-division algorithm, uniqueness from this argument, and the polynomial runtime bounds for addition, multiplication, division, and canonicalization proves the theorem.
[/step]