Consider the binomial series for the principal square root:
$$(1 + z)^{1/2} = \sum_{n=0}^\infty a_n z^n, \quad a_n = \binom{1/2}{n},$$
which converges absolutely for $|z| < 1$. Define
$$B = \sum_{n=0}^\infty a_n A^n.$$
Let $r = \|A\| < 1$. The series converges absolutely in the operator norm because $S(r) := \sum_{n=0}^\infty |a_n| r^n < \infty$ (the radius of convergence is 1).
To verify $B^2 = I + A$, compute the square via the Cauchy product:
$$B^2 = \left( \sum_{n=0}^\infty a_n A^n \right) \left( \sum_{m=0}^\infty a_m A^m \right) = \sum_{k=0}^\infty \left( \sum_{i+j=k} a_i a_j \right) A^k.$$
The double series $\sum_{i,j=0}^\infty |a_i| |a_j| r^{i+j} = S(r)^2 < \infty$ converges absolutely in the Banach algebra $\mathcal{L}(X)$, justifying the rearrangement by powers of $A$ (Tonelli for series).
The coefficients satisfy $\sum_{i+j=k} a_i a_j = \binom{1}{k}$ (the coefficient of $z^k$ in $(1 + z)^{1/2} \cdot (1 + z)^{1/2} = 1 + z$) by the generalized Vandermonde identity:
$$\sum_{i=0}^k \binom{1/2}{i} \binom{1/2}{k-i} = \binom{1}{k} = \begin{cases}
1 & k=0, \\
1 & k=1, \\
0 & k \geq 2,
\end{cases}$$
matching the expansion of $1 + z$. Thus, $B^2 = I + A$.
Alternatively, via partial sums: Let $B_N = \sum_{n=0}^N a_n A^n$ and $T_N = \sum_{n=N+1}^\infty |a_n| r^n$, so $S(r) = S_N + T_N < \infty$ with $S_N = \sum_{n=0}^N |a_n| r^n$. Then
$$B^2 - (I + A) = (B - B_N) B + B_N (B - B_N),$$
so
$$\|B^2 - (I + A)\| \leq \|B - B_N\| (\|B\| + \|B_N\|) \leq 2 S(r) T_N \to 0 \quad \text{as } N \to \infty$$
(since $\|B - B_N\| \leq T_N$, $\|B\| \leq S(r)$, and $\|B_N\| \leq S(r)$), which yields $B^2 = I + A$.
For an explicit rate, note that $|a_n| \sim C n^{-3/2}$ for some $C > 0$, yielding $T_N \lesssim C' r^{N+1} / \sqrt{N}$ for some $C' > 0$, so
$$\|B^2 - (I + A)\| \lesssim \frac{C'' r^{N+1}}{\sqrt{N}}$$
for some $C'' > 0$.
If $f(z) = \sum c_n z^n$ has radius $R > \|A\|$, then $f(A) = \sum c_n A^n$ converges, and products follow from Cauchy convolution. Here, $((1 + z)^{1/2})^2 = 1 + z$ implies $B^2 = I + A$.
If the spectral radius $\rho(A) < 1$ (not necessarily $\|A\| < 1$), then the binomial series defines $B = (I + A)^{1/2}$ in norm and $B^2 = I + A$. By Gelfand's formula $\rho(A) = \lim_{n \to \infty} \|A^n\|^{1/n}$, pick any $c$ with $\rho(A) < c < 1$; then there exists $N$ such that $\|A^n\|^{1/n} \leq c$ for all $n \geq N$, hence $\|A^n\| \leq c^n$ for $n \geq N$. Set $M := \max_{0 \leq n < N} \|A^n\| / c^n$; then $\|A^n\| \leq M c^n$ for all $n \geq 0$. This gives the norm convergence of $\sum a_n A^n$ immediately, and the same Cauchy-product argument gives $B^2 = I + A$.
Moreover, since $\rho(A) < 1$ implies $\sigma(I + A) \subset \{z : \operatorname{Re} z > 0\}$, there is a unique $B$ with $\sigma(B) \subset \{ \operatorname{Re} z > 0\}$ and $B = I + O(A)$ such that $B^2 = I + A$; the series defines exactly this principal square root. $B$ also commutes with $A$.
Newton's method:
$$B_{k+1} = \frac{1}{2} \left( B_k + (I + A) B_k^{-1} \right), \quad B_0 = I,$$
is well-defined—if $\|A\| < 1$ and $B_0 = I$, then $\|B_k - I\|$ stays small; whenever $\|B_k - I\| < 1$ we have $B_k^{-1} = \sum_{n \geq 0} (I - B_k)^n$ (Neumann series)—and converges quadratically in norm to the same $B$.