Acoustic system identification, which aims to estimate the channel impulse response from a source of interest to the microphone position, plays an important role in many applications, e.g., echo cancellation for full-duplex speech communication. Generally, an acoustic channel impulse response is modeled as a linear finite-impulse-response (FIR) filter, so the objective of system identification is to estimate the coefficients of this filter. While much effort has been devoted to this topic over the last five decades, identifying room FIR filters accurately from only a small number of observed data snapshots remains a significant challenge. This paper studies this problem and proposes to model the acoustic impulse response, i.e., the FIR filter, with a tensor decomposition, so that it can be expressed as a multidimensional Kronecker product of a series of shorter filters. A partially time-varying model is then applied to acoustic system identification, in which the global filter is decomposed into two parts: a time-invariant part, which captures the common properties of acoustic channels, and a time-varying part, which represents the components of acoustic channels that change with time. During the identification process, the time-invariant filters can be identified or learned in advance, while the time-varying filters are optimized through an iterative procedure. Simulation results demonstrate that the proposed technique can achieve better acoustic system identification performance when only a small number of data snapshots is available.
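
As a rough illustration of the Kronecker-product model and the time-invariant/time-varying split described above, the following NumPy sketch builds a long filter from three short ones and re-estimates only the short time-varying factor from fewer snapshots than the global filter has taps. Everything specific here is an assumption made for illustration: the filter lengths, the names `h1`, `h2`, `h3`, the random rows standing in for input snapshots, the noise-free observations, and the use of a one-shot least-squares solve in place of the iterative procedure mentioned in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative short filters (lengths are assumptions): 4 * 4 * 8 = 128 global taps.
h1 = rng.standard_normal(4)  # assumed time-invariant, known in advance
h2 = rng.standard_normal(4)  # assumed time-invariant, known in advance
h3 = rng.standard_normal(8)  # assumed time-varying, to be estimated

# Global filter as a Kronecker product of the short filters.
h_fixed = np.kron(h1, h2)        # combined time-invariant part, length 16
h_global = np.kron(h_fixed, h3)  # global filter, length 128

# With the time-invariant part fixed, the global filter is linear in h3:
# kron(h_fixed, h3) == A @ h3, where A = kron(h_fixed as a column, identity).
A = np.kron(h_fixed[:, None], np.eye(h3.size))  # shape (128, 8)
assert np.allclose(A @ h3, h_global)

# Hypothetical data: 20 snapshots, far fewer than the 128 global taps.
# Each row of X stands in for one input snapshot; observations are noise-free.
num_snapshots = 20
X = rng.standard_normal((num_snapshots, h_global.size))
y = X @ h_global

# Plain least squares over the 8 unknowns of h3 (a stand-in, not the paper's method).
h3_hat, *_ = np.linalg.lstsq(X @ A, y, rcond=None)
h_hat = A @ h3_hat

# Normalized misalignment between the estimated and true global filters.
misalignment = np.linalg.norm(h_hat - h_global) / np.linalg.norm(h_global)
print(f"unknowns: {h3.size}, global taps: {h_global.size}, "
      f"misalignment: {misalignment:.2e}")
```

Under these assumptions, 20 snapshots cannot determine all 128 taps directly, but they are enough to determine the 8 coefficients of the time-varying factor once the time-invariant part is known.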